mQuery - Query for maps

  • 'var' name = map
  • 'upd' name = map
  • bag 'map' name ':' map
  • bag 'where' name ':' bool
  • bag 'group' map
  • bag 'agg' map 'by' map 'to' tuple

person p group name map var c->0 agg c->c+1 to name, c

path

  • U = Universum ncluding Bool, Int, Strings ...
  • Map: U ⇒ U also in U (least fixpoint...)

path p:

  • p.len ∈ ℕ
  • p[k] ∈ Map for k ∈ [0, p.len]
  • p.key[k] ∈ Keys for k ∈ [0, p.len-1]
  • p[k][p.key[k]] = p[k+1] for k ∈ [0, p.len-1]

operator on set of paths:

  • σ: selection: σe with e a boolean expression on a single path
  • τ extension: all paths extended by 1
  • ? * +: multiplier on sequence of operators
  • ( ) grouping of operators

multiplier

  • {x ∈ P.pe | be} P a set of paths, b boolean expressions over path xe
  • pe = empty | pe . re | ( pe ) | pe (* | + ?)

map path

we operate on set of nodes (duplicate removal for recursion)

a path is a regular expression with two addition special chars (not included in . and can be escaped with \)

  • > map key -> value node operator
  • < inverse > operator (all nodes that point to)
  • in fact we replaced recursion with a generalized version of Kleene Star
  • maybe a general inverse path operator could be helpfull

Achtung:

  • < Operator brint Set zurück, müssten map als definieren als U ↪ 2U definieren (oder U → 2U und {} translates to undefined)
  • Projektion funktioniert nicht, weil wir dadurch neue Maps erstellen und darauf wieder bag/map Operation anwenden müsssen

path: v0, k0, v1, k1, ..., vn with vi+1 = vi[ki], operators

  • e: P → 2U: e(P) = {vn | (v0, ..., vn) ∈ P}
  • > k forward: P>k = {(v0, k0, ..., vn, k', vi[k']) | (v0, k0, ..., vn) ∈ P, k' matches k and ∃ vi[k']}
  • [] selection: P[π] = {p ∈ P | π(P)}
  • * multiplicity P q * 0 = P and P q * (m+1) = (P q) * m typically not exact multiplies but ranges P q * [m', m"] = P q * m' ∪ P q * (m'+1) ∪ ...

  • B = Bool ∪ Int ∪ String
  • U = B ∪ Map (least Fixpoint)
  • Map = U → U
  • P = U (K U)* # Path: sequence of U, such that k follows a Map , such that for each u k v holds u is a map an u[k] = velement of B
  • PS = 2P #set of paths
  • e expression evaluates to PS
  • e = 2U # lift a set of U to paths of length 1
  • e > r = {p k v | p ∈ e and k matches r]

Hierarchy

B = Basic Values: Bool, Int, Strings ... U = Universe

Relational Model Codd

  • U = B*
  • Bag(U) ∉ U
  1. _k: extract element from tuple
  2. σ: selection
  3. π: projection
  4. bag operations: union, intersection, diff, cardinal product
  5. syntactical sugar: different joins, typically selection from cardinal product
  6. oder, row_number, group by, aggregation

Relational with records / structure

  • U = B* ∪ U*
  • Bag(U) ∉ U
  1. (...): nested make tuple
  2. _ cat _: concatenate two tuples

Relational with muli- & entity-tvalued attributes

  • U = B* ∪ U* ∪ Bag(U)
  1. bag2tuple: count, ∃, ∀, arbitraryOneOff, one (Fail if count <> 1)

fundamentals

  • U = Bool ∪ Nat ∪ Number ∪ String ∪ Map : Universal = Set of all values (maybe als Fun should be included?)
  • Map:
    1. {m | m: U ↪ U} partial functions (not relation!)
    2. {o | [0..len(m)-1] ↣ keys(m) } order/iterator of m. separate from m, or inbuild to o should be injective, no duplicate keys!, maybe index is too much, if base map changes, we should only have a sequence - however with transaction logic that is not really necessary
    3. sorted map: may be used for indexes, or alternatively only within query processing
    4. Arr: {m ∈ Map | m: [0..k] → U}

questions

  • set versus bags, functions versus relations

variables

  • each loop operator opens a new scope with (in the basic language) two variables value and key for the current element of the map. plus parent and src

Expression e

expressionresultvariables in loopcomment
[] | [ e (, e)* ]arr 
[] | [ e ↦ e (, e ↦ e)* ]map 
a app earr → arr 
a poparr → arr 
m add e ↦ emap → map 
m put e ↦ emap → map 
m put e ↦ emap → map sugar for m where key != e add e ↦ e
m where emapval key src
m rem emap → map sugar for m where key != e
m rem ↦ emap → map sugar for m where val != e
m where emapval key src
m map (e →)? emapval key src
m list earrval key src
m sort emSortval key src
m group emap of mapval key src{j ↦ {k ↦ m[k]} | j = e(m[k], k)}
m agg e0 by eUval key act srcu = un; u0= e0; ui+1= e(ui, vi, ki) with ∃ n, m = {ki ↦ vi | i=0..n-1}
m unnest  {[k, j] ↦ v | m[k][j] ∈ m}

examples

  • sum of prices by grp: x group val.grp map agg 0 by act+val.price

existing operators ....

bag

  • bag operators (union, intersection, difference, cartesion product): restrictions for compatibility (column names), multi?set etc.
  • Π projection: Πa1a2...(R): Auswahl von Attributen ai. multi?set
  • σ selection σφ(R): selection of rows
  • ρ rename ρa/b(R) rename attribute b ↦ a
  • joins
    • ⋈ natural join: R ⋈ S: join with equal on common attribute names (and project away one attribute of each pair)
      • if we consider a tuple as a relation ⊆ Attributes × Values
      • Fun(R) = relation R is a function then
      • R ⋈ S = {r ∪ s | r ∈ R and s ∈ S and Fun(r &cpu; s)}
    • Θ Θ-join: R ⋈Θ S: join with boolean condition Θ
    • Semi-Join or Restriction: natural join but project R rsp. S away
    • ⊳ or ⊲ Anti-Join: R &vrtri S = those tuples of R, that do not have a naturaljoin into S
    • Division: R ÷ S: {q | q × S ∈ R}, thus, Attributes of S should be a susbset of Attributes of R. The Division are those tuples of attributes in R only, that have an equijoin to each tuple in S.
    • left outer join: R ⊐⋈ S = natural join R and S plus all tuples from R, expanded with nulls (for the attributes of S) that have no naturalJoin
    • right and full out join similar
  • aggregation: G1,G2, ..., Gm g f1(A1'),f2(A2'), ..., f(Ak')(R): Grouping over G1, ..., Gm (or over whole R), and aggregation functions

Set

  • transitive closure R+ : R ⊆ R+ and &all;xyz: if (x,y) ∈ R+ and (y,z) ∈ R+ then (x,z) ∈ R+ for a binary relation R. Least Fixpoint Semantics (works also with bag, but needs more care with stopping condition)
    • use with common table expression for mutual recursion (maybe add unique constraints, to improve stop condition ...)
    • or functions

Map (functional relations, i.e. keys are set)

  • merge
    • if same key then
      • fail
      • assert values are equal (run- or compiletime?)
      • take first value
      • rename
      • aggregate values

XPath

  • sequences (of items = atomic values or nodes = element attribute text namespace processing-instruction comment document-node)
  • axis: parent::, child::, ancestor::, descendant::, preceding-sibling ....
  • / path operator: for each in left sequence evaluate right. Each non-initial occurrence of // is effectively replaced by /descendant-or-self::node()/
  • [ ] predicate: [2] position in sequence, @id='ab' ....
  • inbuilt functions like string-join ....
  • let $x := ... , ... return ...
  • for $x in ..., ... return ...
  • functions, arrays, maps ...