Support indexless|hierarchical generic content part declaration patterns. #1

@faerietree

Definitions

Indexless := no index numbering scheme, i.e. if a number occurs, it is either content or denotes a hierarchy level in the markup, not a position in a series. => Numbers are matched explicitly (no regular expression). => Parts can only have an implicit ordering.
Indexed := with an index numbering scheme (i.e. explicit order)

Generic := filter by an expression (regex|wildcard|...)
Specific := explicit := filter by explicit content (repeating phrase)
Raw content := markup content
Content := plain text content, i.e. the visible content such as informational text, media, ...
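
To make the generic versus specific distinction concrete, here is a minimal Python sketch. The `Declaration` class and its field names are hypothetical and not part of any existing code base; the only assumption is that a specific declaration is matched as literal text while a generic one is treated as a regular expression.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    """Hypothetical model of one content part declaration."""
    phrase: str        # explicit phrase or regular expression
    generic: bool      # True: phrase is a regex; False: literal text
    raw: bool = False  # True: match markup; False: match plain text

    def compile(self) -> re.Pattern:
        # Specific declarations are escaped so that characters such as
        # "(" in "Teacher (Physics):" are matched literally.
        source = self.phrase if self.generic else re.escape(self.phrase)
        return re.compile(source)


specific = Declaration("Teacher (Physics):", generic=False)
generic = Declaration(r"Exercise[ ]*:", generic=True)

text = "Exercise : solve it. Teacher (Physics): explains."
specific_hits = [m.group() for m in specific.compile().finditer(text)]
generic_hits = [m.group() for m in generic.compile().finditer(text)]
```

Escaping the specific phrase keeps it "guaranteed indexless": it can only ever match the one repeating phrase it spells out.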

Content part declarations

  • [Content phrase filter] (matching all specific content parts within all hierarchy levels mixed)

    • generic|regex|wildcard

      • can match index (have an explicit series order)

        • all|mixed

          building[ ]*([\\d][ .->_]*)+:
          Solution[ ]*([\\d][ .->_]*)+:
          Exercise[ ]*([\\d][ .->_]*)+:
          Teacher[ ]*([\\d][ .->_]*)+:
          Text[ ]*([\\d][ .->_]*)+:
          
        • numbers+special chars only (have an explicit series order, hierarchy e.g. 1.1, 1.2, 2.1, ...)

          ([\\d].)+
          ([\\d])+
          

          Note: A number-based index may need filtering of false positives, because numbers also occur within the content parts.

      • guaranteed indexless

        Exercise[ ]*: // generic|regex|wildcard
        ...
        
    • specific|explicit (guaranteed indexless)

      • all|mixed
        Exercise:
        Teacher:
        Teacher (Physics):
        Teacher Physics:
        Structures :
        Structures:
        Text:
        Mission to achieve: // phrase is a sentence|clause
        
      • numbers+special chars only (the same phrase repeats, e.g. 1.1:, 1.1:, 1.1:, ...)
        3:
        1.2:
        800:
        
  • [Raw content | Markup phrase filter]

    • specific|explicit, match only indexless (no order; numbers denote hierarchy depth; Matches only within one hierarchy level)

      #
      ##
      ---
      ===
      h1
      h2
      h:p level='1'
      header level="1"
      header level="2"
      ...
      
    • generic, can match index (Matching all series within all hierarchy levels in one pass! [1])
      Note: This is the default case for XML-based file formats. It requires tracking the hierarchy depth in code, because a node has no number attached! A node can, however, have a style attached that denotes its depth.

      #+
      header[\\d]*
      h[\\d]*
      section
      
  • [Mixed: Markup & Content phrase filters]

    • specific, mostly indexless (matching all within one hierarchy level with a content filter:)
      #Breaking:
      # Tex
      # general information
      ## specific information
      
    • generic, match only indexless (Matching all series within all hierarchy levels in one pass! [1])
      Note: For all XML-based file formats, this merged pattern is easier to achieve by postprocessing the head of each content part after applying the generic, indexless filter.
      #+(^[<][\\w][>])*[Tt]ext
      #+[ ]?[Gg]eneral information
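
As an illustration of the generic, indexed content phrase filters listed above (for example `Exercise[ ]*([\\d][ .->_]*)+:`), the following Python sketch splits a text into content parts and extracts the index of each. It rests on a few assumptions: the `[\\d]` in the patterns is read as a string-escaped `\d`; the hyphen is moved to the end of the separator class so that it is a literal `-` and not the range from `.` to `>`; and declarations are only accepted at the start of a line, which is one possible way to filter out false positives caused by numbers in the content. The function name `split_parts` is hypothetical.

```python
import re

# Assumption: "[\\d]" in the issue means "\d". The hyphen sits last in the
# separator class so it is a literal "-" rather than a character range.
INDEXED = re.compile(
    r"^(?P<label>Exercise|Solution)[ ]*(?P<index>(?:\d[ .>_-]*)+):",
    re.MULTILINE,
)


def split_parts(text):
    """Return (label, index tuple, body) for every declaration in text."""
    matches = list(INDEXED.finditer(text))
    parts = []
    for i, m in enumerate(matches):
        # A content part ends where the next declaration starts.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        # Each run of digits is read as one hierarchy level of the index.
        index = tuple(int(n) for n in re.findall(r"\d+", m.group("index")))
        parts.append((m.group("label"), index, text[m.end():end].strip()))
    return parts


sample = (
    "Exercise 1.1: Add 2 and 3.\n"
    "Solution 1.1: 5\n"
    "Exercise 1.2: See Exercise 1.1 for hints.\n"
)
parts = split_parts(sample)
```

The mention of "Exercise 1.1" inside the last body is not treated as a declaration, because it neither starts a line nor ends in a colon. Other false positive filters are possible; this one is only the simplest.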
      

[1] Only of limited use, as higher-level elements have no content part under strict sectioning. What remains is only the declaration, unless there is summary|description content between e.g. 1. and its subsection 1.1.
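
The depth tracking mentioned for XML-based formats, together with the limitation described in [1], could look roughly like the following Python sketch. It uses the standard `xml.etree.ElementTree` module; the `doc` root element, the sample document, and the function name `section_parts` are assumptions made for illustration, with `section` taken from the generic markup patterns above.

```python
import xml.etree.ElementTree as ET


def section_parts(element, depth=0, out=None):
    """Collect (depth, own text) for every <section>, depth-first."""
    if out is None:
        out = []
    if element.tag == "section":
        # The node carries no number, so the depth is counted in code.
        depth += 1
        # Only text directly inside this node, before its first child,
        # counts as the node's own content part.
        out.append((depth, (element.text or "").strip()))
    for child in element:
        section_parts(child, depth, out)
    return out


doc = ET.fromstring(
    "<doc>"
    "<section>Summary of 1"
    "<section>Body of 1.1</section>"
    "<section>Body of 1.2</section>"
    "</section>"
    "<section><section>Body of 2.1</section></section>"
    "</doc>"
)
parts = section_parts(doc)
```

The second top-level section yields an empty content part, which is the case [1] describes: under strict sectioning only the declaration remains, while the first section keeps its summary text.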

Purpose

These declaration patterns are essential for the worlddevelopment civilization editor, open bookkeeper bot, ...
