Macrometacorp · MarkoMacrometa · Jan 27, 2023 · Jan 30, 2023 · Jan 30, 2023 · Jan 31, 2023
diff --git a/docs/best-practice/_category_.json b/docs/best-practice/_category_.json
@@ -0,0 +1,4 @@
+{
+  "label": "Best Practices",
+  "position": 125
+}
diff --git a/docs/best-practice/limit-and-offset-vs-cursor-api-and-hasmore.md b/docs/best-practice/limit-and-offset-vs-cursor-api-and-hasmore.md
@@ -0,0 +1,166 @@
+---
+sidebar_position: 50
+title: LIMIT and OFFSET vs Cursor API and hasMore
+---
+
+Queries that return large volumes of data may require more than one response to provide the complete results. A common method is to page through the results with `LIMIT` and `OFFSET` in the query. However, this is not the optimal approach: each page requires a new invocation of the query, and query processing is performed again every time.
+
+A better approach is to use the `cursor` API. The response from a cursor API call contains a boolean attribute, `hasMore`. If `hasMore` is `true`, the next batch of records is ready on the server, and subsequent calls return records starting from the last position.
+
+The `batchSize` parameter can be set in the `cursor` API to configure the number of documents returned per request. The `batchSize` default is `100` and the maximum value is `1000`. 
+
+**Query with `LIMIT` and `OFFSET`**
+
+```sql
+FOR car in Cars
+FILTER car.type == 'SUV'
+SORT car._key DESC
+LIMIT 0, 3
+RETURN car
+```
+
+**Query without `LIMIT` and `OFFSET`**
+
+```sql
+FOR car in Cars
+FILTER car.type == 'SUV'
+SORT car._key DESC
+RETURN car
+```
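To make the difference concrete, here is a back-of-the-envelope sketch in Python. It assumes, as described above, that every `LIMIT`/`OFFSET` page is a separate query invocation, while a cursor invokes the query once and afterwards only reads batches from the last position. The function name and the returned keys are hypothetical, and the numbers are illustrative only.

```python
import math
from typing import Dict


def paging_cost(total_results: int, page_size: int) -> Dict[str, int]:
    """Compare how often the query is processed for each paging approach."""
    requests_needed = max(1, math.ceil(total_results / page_size))
    return {
        # LIMIT/OFFSET: every page is a new query invocation.
        "limit_offset_query_runs": requests_needed,
        # Cursor: one invocation creates the cursor; the remaining
        # requests only read the next batch from the last position.
        "cursor_query_runs": 1,
        "cursor_batch_reads": requests_needed - 1,
    }
```

For 195 matching documents and a page size of 3, this model gives 65 query runs for `LIMIT`/`OFFSET` paging against a single query run plus 64 batch reads for a cursor.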
+
+**Cursor API Request Example (create cursor)**
+
+The request body has attributes for `batchSize`, `bindVars`, `count`, `query`, `options`, and `ttl`. The `options` attribute accepts several key/value pairs. To use `hasMore`, set the `stream` option to `true`. Note that the default value for `stream` is `false`.
+
+```bash
+curl -X 'POST' \
+  'https://api-gdn.paas.macrometa.io/_fabric/_system/_api/cursor' \
+  -H 'accept: application/json' \
+  -H 'Content-Type: application/json' \
+  -H 'Authorization: bearer <token>' \
+  -d '{
+  "batchSize": 100,
+  "bindVars": {},
+  "options": {
+    "stream": true
+  },
+  "query": "FOR car in Cars FILTER car.type == '\''SUV'\'' SORT car._key DESC RETURN car",
+  "ttl": 30
+}'
+```
+
+**Cursor API Response Example (create cursor)**
+
+The response from the Cursor API request will contain several attributes. The most important to this example are `hasMore` and `id`. The `id` identifies the cursor during subsequent requests and `hasMore` is a boolean value to indicate whether there are more results to be retrieved.
+
+```
+{
+  "result": [
+    {
+      "_id": "Cars/377189715",
+      "_key": "377189715",
+      "_rev": "_eWFT8Eu--_",
+      "customer_id": 994,
+      "make": "Jeep",
+      "model": "Wagoneer",
+      "type": "SUV",
+      "year": 2022
+    },
+    ...
+    {
+      "_id": "Cars/377187243",
+      "_key": "377187243",
+      "_rev": "_eWFTXbS--_",
+      "customer_id": 890,
+      "make": "Volkswagen",
+      "model": "Atlas",
+      "type": "SUV",
+      "year": 2021
+    }
+  ],
+  "hasMore": true, // shows if there are more results
+  "id": "463970894", // identifies the cursor to return the next batch
+  "count": 195,
+  "extra": {
+    "stats": {
+      "writesExecuted": 0,
+      "writesIgnored": 0,
+      "scannedFull": 0,
+      "scannedIndex": 195,
+      "filtered": 0,
+      "httpRequests": 0,
+      "executionTime": 0.0004,
+      "peakMemoryUsage": 100
+    },
+    "warnings": []
+  },
+  "cached": false,
+  "error": false,
+  "code": 201
+}
+```
+
+**Cursor API Request Example (read next batch)**
+
+```bash
+curl -X PUT "https://api-gdn.paas.macrometa.io/_fabric/_system/_api/cursor/463970894" \
+  -H "Authorization: bearer <token>"
+```
+
+**Cursor API Response Example (read next batch)**
+
+When the `hasMore` value is `false`, there are no further results to return from the server.
+
+```json
+{
+  "result": [
+    {
+      "_id": "Cars/377176738",
+      "_key": "377176738",
+      "_rev": "_eWFQ7di--_",
+      "customer_id": 345,
+      "make": "Toyota",
+      "model": "RAV4",
+      "type": "SUV",
+      "year": 2022
+    },
+    ...
+    {
+      "_id": "Cars/349446110",
+      "_key": "349446110",
+      "_rev": "_eWFB8OW--_",
+      "customer_id": 123,
+      "make": "Audi",
+      "model": "Q5",
+      "type": "SUV",
+      "year": 2019
+    }
+  ],
+  "hasMore": false, // When false, no more results can be returned
+  "id": "463970894", // Same cursor ID from the initial response
+  "count": 195,
+  "extra": {
+    "stats": {
+      "writesExecuted": 0,
+      "writesIgnored": 0,
+      "scannedFull": 0,
+      "scannedIndex": 195,
+      "filtered": 0,
+      "httpRequests": 0,
+      "executionTime": 0.0004,
+      "peakMemoryUsage": 100
+    },
+    "warnings": []
+  },
+  "cached": false,
+  "error": false,
+  "code": 201
+}
+```
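The create and read-next-batch calls above can be combined into one loop. The following is a minimal Python sketch, not an official client: the URL is copied from the curl example, the request body mirrors the one shown there, and `make_http_sender` and `iter_query_results` are hypothetical helper names. The HTTP call is injected as a function so that the loop can be exercised without a network connection.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Iterator, Optional

# Assumption: endpoint copied from the curl example above.
BASE_URL = "https://api-gdn.paas.macrometa.io/_fabric/_system/_api/cursor"

Send = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def make_http_sender(token: str) -> Send:
    """Build a sender that performs real HTTP calls with the bearer token."""

    def send(method: str, url: str, body: Optional[Dict[str, Any]]) -> Dict[str, Any]:
        data = json.dumps(body).encode("utf-8") if body is not None else None
        request = urllib.request.Request(url, data=data, method=method)
        request.add_header("accept", "application/json")
        request.add_header("Content-Type", "application/json")
        request.add_header("Authorization", f"bearer {token}")
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))

    return send


def iter_query_results(send: Send, query: str, batch_size: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every document of the query, reading batches until hasMore is false."""
    body = {
        "batchSize": batch_size,
        "bindVars": {},
        "options": {"stream": True},
        "query": query,
        "ttl": 30,
    }
    page = send("POST", BASE_URL, body)  # creates the cursor and returns the first batch
    while True:
        yield from page["result"]
        if not page.get("hasMore"):
            break
        # Read the next batch using the cursor id from the previous response.
        page = send("PUT", f"{BASE_URL}/{page['id']}", None)
```

A caller would iterate with something like `for car in iter_query_results(make_http_sender("<token>"), "FOR car IN Cars FILTER car.type == 'SUV' SORT car._key DESC RETURN car")`. Because the function is a generator, a consumer that stops early never requests the remaining batches.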
+
+**API Reference Docs**
+
+[Create Query Cursor](https://macrometa.com/docs/api#/operations/createQueryCursor)
+
+[Modify Query Cursor](https://macrometa.com/docs/api#/operations/modifyQueryCursor)
diff --git a/docs/best-practice/multiple-collections-vs-single-large-collection.md b/docs/best-practice/multiple-collections-vs-single-large-collection.md
@@ -0,0 +1,113 @@
+---
+sidebar_position: 50
+title: Multiple collections vs single large collection
+---
+
+Query performance is linked, in part, to the number of documents in the collections and the indexes used. When a single collection contains a large number of complex documents, optimizing for performance becomes difficult. Designing collections around purpose-built documents, with indexes that target specific results, makes query writing simpler and improves performance.
+
+In this example, we have a single collection, `Garage`. Each document contains `account`, `cars`, `orders`, and `staff` attributes, each with further nested attributes. This makes query writing and indexing difficult. Here is an example document for the `Garage` collection.
+
+```
+{
+    "_id": "Garage/349351645",
+    "_key": "349351645",
+    "_rev": "_eUgrDn2--_",
+    "account": {
+      "first_name": "John",
+      "id": 123,
+      "joined_date": "2022-01-01",
+      "last_name": "Doe",
+      "phone": "555-555-5555"
+    },
+    "cars": {
+      "car_a": {
+        "make": "Audi",
+        "model": "Q5",
+        "year": "2019"
+      },
+      "car_b": {
+        "make": "Ford",
+        "model": "F-150",
+        "year": "2021"
+      }
+    },
+    "orders": {
+      "account_id": 123,
+      "car_id": "car_b",
+      "customer_phone": "555-555-5555",
+      "date": "2022-03-14",
+      "invoice_number": 456,
+      "price": "$100.00"
+    },
+    "staff": {
+      "first_name": "Jane",
+      "last_name": "Smith",
+      "tech_id": 789
+    }
+  }
+```
+
+The next example shows how one might structure the same data as documents in individual collections. This approach makes it easier to create indexes on the right attributes in each collection and reduces the number of records scanned.
+
+```
+// Account Document
+{
+    "_id": "Accounts/349491803",
+    "_key": "349491803",
+    "_rev": "_eUhBHmi--_",
+    "car_ids": [
+      "Cars/349434363",
+      "Cars/349446110"
+    ],
+    "first_name": "John",
+    "id": 123,
+    "joined_date": "2022-01-01",
+    "last_name": "Doe",
+    "phone": "555-555-5555"
+  }
+
+// Car Documents
+{
+    "_id": "Cars/349446110",
+    "_key": "349446110",
+    "_rev": "_eUg1Tl6--_",
+    "customer_id": 123,
+    "make": "Audi",
+    "model": "Q5",
+    "year": 2019
+  },
+  {
+    "_id": "Cars/349434363",
+    "_key": "349434363",
+    "_rev": "_eUg1fJe--_",
+    "customer_id": 123,
+    "make": "Ford",
+    "model": "F-150",
+    "year": 2021
+  }
+
+// Order Document
+{
+    "_id": "Orders/349454643",
+    "_key": "349454643",
+    "_rev": "_eUg9dXS--_",
+    "account_id": 123,
+    "car_ids": [
+      "Cars/349446110"
+    ],
+    "date": "2022-03-14",
+    "invoice_number": 456,
+    "price": "$100.00",
+    "staff_id": 789
+  }
+
+// Staff Document
+{
+    "_id": "Staff/349422825",
+    "_key": "349422825",
+    "_rev": "_eUgvNOW--_",
+    "first_name": "Jane",
+    "last_name": "Smith",
+    "tech_id": 789
+  }
+```
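The restructuring above can be sketched as a small transformation. This is an illustrative Python function under stated assumptions, not part of any Macrometa tooling: `split_garage_document` is a hypothetical name, and the nested car names (`car_a`, `car_b`) are reused as `_key` values only for the example, whereas the documents above use generated keys.

```python
from typing import Any, Dict, List


def split_garage_document(garage: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Split one nested Garage document into documents for separate collections."""
    account_id = garage["account"]["id"]

    # One document per car, linked back to the account and with a numeric year.
    cars = [
        {"_key": car_key, "customer_id": account_id, **car, "year": int(car["year"])}
        for car_key, car in garage["cars"].items()
    ]

    # The account keeps references to its cars instead of embedding them.
    account = {**garage["account"], "car_ids": [f"Cars/{car['_key']}" for car in cars]}

    # The order references the car and the staff member; the duplicated
    # customer phone number is dropped because it lives on the account.
    source = garage["orders"]
    order = {
        "account_id": source["account_id"],
        "car_ids": [f"Cars/{source['car_id']}"],
        "date": source["date"],
        "invoice_number": source["invoice_number"],
        "price": source["price"],
        "staff_id": garage["staff"]["tech_id"],
    }

    return {
        "Accounts": [account],
        "Cars": cars,
        "Orders": [order],
        "Staff": [dict(garage["staff"])],
    }
```

Each returned list holds the documents for one collection, so indexes can then be chosen per collection, for example on `customer_id` in `Cars`.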
diff --git a/docs/best-practice/use-o-indexes-for-collect-operation.md b/docs/best-practice/use-o-indexes-for-collect-operation.md
@@ -0,0 +1,14 @@
+---
+sidebar_position: 50
+title: Use of indexes for COLLECT operation
+---
+
+If a query contains a `COLLECT` operation, the records with the same attribute values are grouped. A persistent index on the attribute used in the `COLLECT` operation helps to optimize the query. In the following example, a persistent index on the `country` attribute helps to optimize the query.
+
+```
+FOR p IN players
+  COLLECT country = p.country
+  RETURN {
+    "country" : country
+  }
+```
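One plausible reason why the index helps, offered here as an assumption about sorted indexes in general rather than a statement about the engine's internals: if the values arrive already sorted, grouping is a single pass over adjacent equal values, with no separate sort step. A conceptual Python sketch with hypothetical function names:

```python
from itertools import groupby
from typing import Iterable, List


def collect_sorted(countries_in_index_order: Iterable[str]) -> List[str]:
    """Group values that already arrive in sorted order, as a sorted index would deliver them."""
    # groupby merges only adjacent equal values, so one pass is enough
    # when the input is sorted.
    return [country for country, _ in groupby(countries_in_index_order)]


def collect_unsorted(countries: Iterable[str]) -> List[str]:
    """Without a sorted source, the values must be sorted before they can be grouped this way."""
    return collect_sorted(sorted(countries))
```

Both functions return the same groups; the second one simply pays for the sort that the index would otherwise have provided.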
diff --git a/docs/best-practice/use-of-composite-index.md b/docs/best-practice/use-of-composite-index.md
@@ -0,0 +1,6 @@
+---
+sidebar_position: 50
+title: Use of composite index
+---
+
+If multiple attributes are used in the `FILTER` criteria, it is recommended to create a composite index that covers all of them. For example, if `3` attributes are used in `FILTER`, a `composite index` created on these `3` attributes gives better query performance than `3` separate indexes.
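The intuition can be modelled with plain dictionaries. The Python sketch below is a conceptual model only, not how the database stores indexes: the function names and sample attributes are hypothetical, and the indexes are rebuilt on every lookup purely to keep the example short. It counts how many candidate entries each strategy touches before it arrives at the matching documents.

```python
from collections import defaultdict
from typing import Any, Dict, List, Set, Tuple

Doc = Dict[str, Any]


def build_index(docs: List[Doc], attributes: Tuple[str, ...]) -> Dict[Tuple[Any, ...], Set[int]]:
    """Map each combination of attribute values to the positions of matching documents."""
    index: Dict[Tuple[Any, ...], Set[int]] = defaultdict(set)
    for position, doc in enumerate(docs):
        index[tuple(doc[name] for name in attributes)].add(position)
    return dict(index)


def lookup_with_separate_indexes(docs: List[Doc], wanted: Doc) -> Tuple[Set[int], int]:
    """Use one index per attribute; return the matches and the number of candidates touched."""
    candidate_sets = [
        build_index(docs, (name,)).get((value,), set()) for name, value in wanted.items()
    ]
    touched = sum(len(candidates) for candidates in candidate_sets)
    # Assumes at least one attribute in `wanted`.
    return set.intersection(*candidate_sets), touched


def lookup_with_composite_index(docs: List[Doc], wanted: Doc) -> Tuple[Set[int], int]:
    """Use one index over all attributes; a single lookup yields exactly the matches."""
    names = tuple(wanted)
    matches = build_index(docs, names).get(tuple(wanted[name] for name in names), set())
    return matches, len(matches)
```

In this model the separate indexes each return a candidate set that still has to be intersected or filtered, while the composite index goes directly to the matching entries.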
diff --git a/docs/best-practice/use-of-search-for-array-attributes.md b/docs/best-practice/use-of-search-for-array-attributes.md
@@ -0,0 +1,28 @@
+---
+sidebar_position: 50
+title: Use of SEARCH for array attributes
+---
+
+To `FILTER` against an array of values, the `ALL`, `ANY`, and `NONE` operators are used. Array indexes do not help because these operators do not utilize them. Users can create a `SEARCH VIEW` to optimize these queries.
+
+To filter attributes against an array of values, you would commonly use one of the array comparison operators `ALL`, `ANY`, or `NONE` as a prefix in conjunction with a comparison operator such as `IN`. However, this approach is not optimized and does not utilize any indexes.
+
+The optimized approach uses the `SEARCH` feature. An index is created on the attributes defined in the search view. You can read more about `SEARCH` and search views in [search](https://macrometa.com/docs/search/search).
+
+```
+/* Query on a collection with FILTER */
+
+LET carMakes = ["Ford", "Audi", "Mazda"]
+   FOR car in cars
+       FILTER car.make ANY IN carMakes
+       FILTER car.type == "SUV"
+   RETURN { car : car}
+
+/* Query on Search view with SEARCH */
+/* Search VIEW is created with the required attributes. */
+
+LET carMakes = ["Ford", "Audi", "Mazda"]
+   FOR car in CARS_VIEW
+     SEARCH ANALYZER(car.make ANY IN carMakes, "identity")
+   RETURN car
+```
diff --git a/docs/best-practice/use-of-search-for-sort-operations.md b/docs/best-practice/use-of-search-for-sort-operations.md
@@ -0,0 +1,28 @@
+---
+sidebar_position: 50
+title: Use of SEARCH for SORT operations
+---
+
+Due to known limitations, if a `SORT` operation is specified in the query, indexes are not used for the attributes specified in the `FILTER` part. The alternative is to create a `SEARCH VIEW` with the required attributes and to set the attribute that the results must be sorted by as the primary sort attribute of the `SEARCH VIEW`.
+
+Note: Only `1` attribute can be added as a `Primary Sort` attribute.
+
+```
+FOR city in cities
+   FILTER city.continent == "ASIA" AND
+          city.country == "CHINA" AND
+          city.type == "RURAL" AND
+          city.population > 40000
+   SORT city.population DESC     
+   return { city : city}
+
+/* 
+ * Query on Search view with SEARCH 
+ * Search VIEW is created with the required attributes.
+ * Add PrimarySort with the required attribute and order
+ */
+FOR city in CITIES_VIEW
+   SEARCH ANALYZER(city.continent == "ASIA" AND
+          city.country == "CHINA" AND
+          city.type == "RURAL" AND
+          city.population > 40000, "identity")
+   return { city : city}
+```
diff --git a/...est-practice/use-of-stream-worker-for-optimization-of-reporting-related-jobs.md b/...est-practice/use-of-stream-worker-for-optimization-of-reporting-related-jobs.md
@@ -0,0 +1,7 @@
+---
+sidebar_position: 50
+title: Use of Stream Worker for optimization of reporting-related jobs
+---
+
+Consider a scheduled reporting job that runs at the end of each week on a collection with millions of records, where the report must include records for each day of the week. Running the query on such a large collection to get the data for all seven days is not efficient. To tackle this, a `Stream Worker` can be used. A `Stream Worker` can process data on the `Stream` associated with the collection, analyze it, generate the `staged` data, and store that data in a `CACHE` collection.
+
+For example, to get the number of `GET` requests made each day from each `IP Address`, the `Stream Worker` can analyze the incoming data and store `user information`, the number of `GET` requests, `Date`, and `User name` in the `CACHE` collection, so that the report does not have to scan the huge `ACCESS LOG` collection. Because the `CACHE` collection has fewer records than the `ACCESS LOG` collection, query execution is faster.
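The idea of staging data as it arrives can be illustrated without any stream worker syntax. The Python sketch below is a conceptual model only: the event shape, the `update_cache` and `daily_get_counts` names, and the in-memory `Counter` standing in for the `CACHE` collection are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Tuple

# Hypothetical event shape: {"ip": ..., "method": ..., "date": ...}
AccessEvent = Dict[str, str]


def update_cache(cache: Counter, event: AccessEvent) -> None:
    """Process one access-log event as it arrives and update the staged counts."""
    if event["method"] == "GET":
        cache[(event["ip"], event["date"])] += 1


def daily_get_counts(cache: Counter) -> Dict[Tuple[str, str], int]:
    """The report reads only the small staged data, never the full access log."""
    return dict(cache)
```

The staged data holds one entry per IP address and day, so the weekly report reads a number of records that depends on the distinct IP addresses and days rather than on the total number of requests.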