# Configuring search

{! administration/CLI_tasks/general_cli_task_info.include !}

## Built-in search

To use the built-in search, which has no external dependencies, set the search module to `Pleroma.Search.DatabaseSearch`:

> config :pleroma, Pleroma.Search, module: Pleroma.Search.DatabaseSearch

While it has no external dependencies, it suffers from poor performance and less relevant results.

## Meilisearch

Note that Meilisearch is considerably more memory hungry than PostgreSQL (around 4-5G for ~1.2 million
posts while idle and up to 7G during initial indexing). The additional index also takes up
around 4 gigabytes of disk space. Like [RUM](./cheatsheet.md#rum-indexing-for-full-text-search) indexes, it offers considerably
higher performance and ordering by timestamp in a reasonable amount of time.
Additionally, the search results seem to be more accurate.

Due to the high memory usage, if you run pleroma on a low-resource computer it may be best to set Meilisearch up
on a different machine and use private key authentication to secure the remote search instance.

To use [meilisearch](https://www.meilisearch.com/), set the search module to `Pleroma.Search.Meilisearch`:

> config :pleroma, Pleroma.Search, module: Pleroma.Search.Meilisearch

You then need to set the address of the meilisearch instance, and optionally the private key for authentication. You might
also want to lower `initial_indexing_chunk_size` if your server is not very powerful. Do not raise it above `100_000`,
because meilisearch will refuse to process a chunk that is too big. In general, though, you want this to be as big as possible, because meilisearch
indexes faster when it can process many posts in a single batch.

> config :pleroma, Pleroma.Search.Meilisearch,
>   url: "http://127.0.0.1:7700/",
>   private_key: "private key",
>   initial_indexing_chunk_size: 100_000

Information about setting up meilisearch can be found in the
[official documentation](https://docs.meilisearch.com/learn/getting_started/installation.html).
You probably want to start it with the `MEILI_NO_ANALYTICS=true` environment variable to disable analytics.
At least version 0.25.0 is required, but you are strongly advised to use at least 0.26.0, as it introduces
the `--enable-auto-batching` option, which drastically improves performance. Without this option, search
is hardly usable on a somewhat big instance.
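
As a rough sketch of the startup described above, the following wrapper passes the analytics setting and the auto-batching option to the server. The function name `start_meilisearch` and the `MEILI_BIN` variable are made up for illustration, and the sketch assumes the binary is called `meilisearch` and is on your `PATH`; adjust it to match how you installed meilisearch.

```sh
# Hypothetical wrapper around the meilisearch binary (0.26.0 or newer).
# MEILI_BIN is an assumed override for the location of the binary.
start_meilisearch() {
  # Disable analytics and enable auto-batching; extra arguments are passed through.
  MEILI_NO_ANALYTICS=true "${MEILI_BIN:-meilisearch}" --enable-auto-batching "$@"
}
```

Calling `start_meilisearch` without arguments should then start the server with both settings applied.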

### Private key authentication (optional)

To enable key authentication, set the `MEILI_MASTER_KEY` environment variable when starting meilisearch. After setting the _master key_,
you have to get the _private key_, which is what is actually used for authentication.

=== "OTP"
    ```sh
    ./bin/pleroma_ctl search.meilisearch show-keys <your master key here>
    ```

=== "From Source"
    ```sh
    mix pleroma.search.meilisearch show-keys <your master key here>
    ```

You will see a "Default Admin API Key". This is the key you actually put into your configuration file.

### Initial indexing

After setting up the configuration, you'll want to index all of your already existing posts. Only public posts are indexed. You'll only
have to do it once, but it might take a while, depending on the number of posts your instance has seen. This is also a fairly RAM-intensive
process for `meilisearch`, and it will keep using a lot of RAM while running if you have a lot of posts (it seems to be around 5G for ~1.2
million posts while idle and up to 7G during initial indexing, but your experience may be different).

The sequence of actions is as follows:

1. First, change the configuration to use `Pleroma.Search.Meilisearch` as the search backend
2. Restart your instance. At this point it can be used while the search indexing is running, though search won't return anything
3. Start the initial indexing process (as described below with `index`),
   and wait until the task says it sent everything from the database to index
4. Wait until everything is actually indexed (by checking with `stats` as described below).
   At this point you don't have to do anything, just wait a while.
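
Before starting the indexing in step 3, it can help to confirm that meilisearch is reachable at the configured address. The sketch below assumes `curl` is installed and uses meilisearch's `/health` endpoint. `MEILI_URL` and the function names are hypothetical; the URL should mirror the `url` from your configuration.

```sh
# Assumed to match the `url` set for Pleroma.Search.Meilisearch.
MEILI_URL="${MEILI_URL:-http://127.0.0.1:7700/}"

# Build the health endpoint URL, tolerating a trailing slash on the base URL.
meili_health_url() {
  printf '%s/health\n' "${1%/}"
}

# Succeeds only if the instance answers the health check.
check_meilisearch() {
  curl -fsS "$(meili_health_url "$MEILI_URL")"
}
```

If `check_meilisearch` fails, the instance is probably not running or the address is wrong, and indexing is unlikely to work either.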

To start the initial indexing, run the `index` command:

=== "OTP"
    ```sh
    ./bin/pleroma_ctl search.meilisearch index
    ```

=== "From Source"
    ```sh
    mix pleroma.search.meilisearch index
    ```

This will show you the total number of posts to index, and then the number of posts indexed so far, until the two numbers eventually
become the same. The posts are indexed in big batches, and meilisearch will take some time to actually index them, even after you have
inserted all the posts into it. Depending on the number of posts, this may take as long as several hours. To get information about the status
of indexing and how many posts have actually been indexed, use the `stats` command:

=== "OTP"
    ```sh
    ./bin/pleroma_ctl search.meilisearch stats
    ```

=== "From Source"
    ```sh
    mix pleroma.search.meilisearch stats
    ```
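
Since indexing can take hours, you may not want to re-run `stats` by hand. As a minimal sketch, the loop below repeats the command at a fixed interval. `poll_stats` and `STATS_CMD` are hypothetical names, and the default assumes an OTP install. It does not parse the output, so you still have to judge for yourself when indexing has finished.

```sh
# Hypothetical helper: run the stats command repeatedly.
# STATS_CMD is an assumed override, e.g. "mix pleroma.search.meilisearch stats".
poll_stats() {
  interval="${1:-60}"   # seconds to wait between runs
  rounds="${2:-10}"     # how many times to run the command
  cmd="${STATS_CMD:-./bin/pleroma_ctl search.meilisearch stats}"
  i=0
  while [ "$i" -lt "$rounds" ]; do
    $cmd                # intentionally unquoted so the command splits into words
    i=$((i + 1))
    if [ "$i" -lt "$rounds" ]; then
      sleep "$interval"
    fi
  done
}
```

For example, `poll_stats 300 12` would print the stats every five minutes for roughly an hour.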

### Clearing the index

In case you need to clear the index (for example, to re-index from scratch, if that needs to happen for some reason), you can
use the `clear` command:

=== "OTP"
    ```sh
    ./bin/pleroma_ctl search.meilisearch clear
    ```

=== "From Source"
    ```sh
    mix pleroma.search.meilisearch clear
    ```

This will clear **all** the posts from the search index. Note that deleted posts are also removed from the index by the instance itself, so
there is no need to clear the whole index unless you want **all** of it gone. That said, the index does not hold any information
that cannot be re-created from the database. It should also generally be a lot smaller than your database, although its size
depends on the amount of text in posts.
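
For the re-index-from-scratch case, the two commands can be chained so that indexing only starts if clearing succeeded. This is just a sketch: `reindex_from_scratch` and `PLEROMA_CTL` are hypothetical names, and the default assumes an OTP install run from the release directory.

```sh
# Assumed path to the OTP control script; override it if yours lives elsewhere.
PLEROMA_CTL="${PLEROMA_CTL:-./bin/pleroma_ctl}"

# Hypothetical helper: clear the index, then re-send all posts for indexing.
reindex_from_scratch() {
  # `&&` ensures `index` is skipped when `clear` fails.
  "$PLEROMA_CTL" search.meilisearch clear &&
    "$PLEROMA_CTL" search.meilisearch index
}
```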

## Elasticsearch

**Note: This requires at least Elasticsearch 7**

As with meilisearch, this can be rather memory-hungry, but it is very good at what it does.

To use [elasticsearch](https://www.elastic.co/), set the search module to `Pleroma.Search.Elasticsearch`:

> config :pleroma, Pleroma.Search, module: Pleroma.Search.Elasticsearch

You then need to set the URL and, if relevant, the authentication credentials.

> config :pleroma, Pleroma.Search.Elasticsearch.Cluster,
>   url: "http://127.0.0.1:9200/",
>   username: "elastic",
>   password: "changeme"
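
To check the version requirement and the credentials in one go, you can query the cluster's root endpoint, which reports the version number. The following is a sketch that assumes `curl` and `sed` are available; the `ES_*` variables and function names are hypothetical and should mirror the values in your configuration.

```sh
# Assumed to match the values configured for Pleroma.Search.Elasticsearch.Cluster.
ES_URL="${ES_URL:-http://127.0.0.1:9200/}"
ES_USER="${ES_USER:-elastic}"
ES_PASS="${ES_PASS:-changeme}"

# Read the cluster info JSON on stdin and print the value of its "number" field.
es_version_from_json() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeeds if the major version of $1 is 7 or higher.
es_version_ok() {
  major="${1%%.*}"
  [ "$major" -ge 7 ]
}

# Fetch the version from the cluster and report whether it is new enough.
check_elasticsearch() {
  version="$(curl -fsS -u "$ES_USER:$ES_PASS" "$ES_URL" | es_version_from_json)"
  if es_version_ok "$version"; then
    echo "Elasticsearch $version is supported"
  else
    echo "Elasticsearch version '$version' is too old or could not be determined" >&2
    return 1
  fi
}
```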

### Initial indexing

After setting up the configuration, you'll want to index all of your already existing posts. You'll only have to do it once, but it might take a while, depending on the number of posts your instance has seen.

The sequence of actions is as follows:

1. First, change the configuration to use `Pleroma.Search.Elasticsearch` as the search backend
2. Restart your instance. At this point it can be used while the search indexing is running, though search won't return anything
3. Start the initial indexing process (as described below),
   and wait until the task says it sent everything from the database to index
4. Wait until the index task exits

To start the initial indexing, run the `search import activities` task:

=== "OTP"
    ```sh
    ./bin/pleroma_ctl search import activities
    ```

=== "From Source"
    ```sh
    mix pleroma.search import activities
    ```