# Welcome to my WebSub Hub

## What

[WebSub](https://www.w3.org/TR/websub/) is a protocol for subscribing to content updates from publishers. The Hub is the central component which manages that relationship.

This Hub implementation was created with personal [self-hostable](https://indieweb.org/WebSub) deployment in mind. It is content-agnostic, supports multiple database backends, and can scale to multiple nodes for robustness and capacity.

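For orientation, a subscriber asks a hub to start (or stop) sending updates by POSTing a form-encoded request naming the topic and its own callback URL. The following is a minimal sketch of building that request body using the parameter names from the WebSub specification; the `subscriptionRequestBody` helper and the example URLs are hypothetical and are not part of this repository.

```javascript
'use strict';

// Hypothetical helper: build the form body a subscriber would POST to a hub.
// The hub.* parameter names come from the WebSub specification.
function subscriptionRequestBody(topicUrl, callbackUrl, leaseSeconds) {
  const body = new URLSearchParams({
    'hub.mode': 'subscribe',
    'hub.topic': topicUrl,
    'hub.callback': callbackUrl,
  });
  if (leaseSeconds) {
    // Optional: how long the subscriber would like the subscription to last.
    body.set('hub.lease_seconds', String(leaseSeconds));
  }
  return body.toString();
}

// Example URLs are placeholders.
console.log(subscriptionRequestBody('https://example.com/feed', 'https://reader.example/callback'));
```

The resulting string would be sent with a `Content-Type` of `application/x-www-form-urlencoded`.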
## Beware

This is currently a Minimum Viable Product release. Basic functionality is complete, but the administration experience may be challenging.

## Up And Running

Customize configuration within `config/${NODE_ENV}.js`. Each environment inherits any setting it does not specify from `config/default.js`. The environment is selected by the `NODE_ENV` value, defaulting to `development` if unset.

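The inheritance described above can be pictured as a recursive overlay of an environment's values onto the defaults. This is only a sketch under that assumption: `mergeConfig` and the `someOption` setting are hypothetical, and the Hub's actual `config/index.js` may work differently.

```javascript
'use strict';

// Hypothetical sketch: recursively overlay environment settings on defaults.
function mergeConfig(defaults, overrides) {
  const merged = { ...defaults };
  for (const [key, value] of Object.entries(overrides)) {
    const base = merged[key];
    const bothPlainObjects = [base, value].every(
      (v) => v !== null && typeof v === 'object' && !Array.isArray(v),
    );
    // Nested objects are merged; anything else is replaced outright.
    merged[key] = bothPlainObjects ? mergeConfig(base, value) : value;
  }
  return merged;
}

// Illustrative values only; see config/default.js for the real settings.
const exampleDefaults = {
  dingus: { selfBaseUrl: 'http://localhost/', someOption: true },
};
const exampleProduction = {
  dingus: { selfBaseUrl: 'https://hub.squeep.com/' },
};

const exampleConfig = mergeConfig(exampleDefaults, exampleProduction);
console.log(exampleConfig.dingus.selfBaseUrl); // the production value wins
console.log(exampleConfig.dingus.someOption); // the default is inherited
```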
Database table initialization and schema version migrations are automated. Configure SQLite with a database file, or point PostgreSQL at an already-created database.

A user must be created before the `/admin` pages can be viewed; the `bin/authUserAdd.js` script does this.

The bundled logger spews JSON to stdout.

An IndieAuth profile may be used to view any topics associated with that profile.

![IndieAuth view of topics](./documentation/media/topics-indieauth.png)

### Quickstart Example

One way of deploying this server is behind nginx, with the pm2 package to manage the server process, and a local PostgreSQL database. Some details on this are presented here as a rough guide to any parts of this stack which may be unfamiliar.

- Have Node.js 14-ish available.
- Have PostgreSQL available.
- Clone the server repository.  
  ```git clone https://git.squeep.com/websub-hub```
- Install the production dependencies.  
  ```cd websub-hub```  
  ```NODE_ENV=production npm i```
- Create a ```config/production.js``` configuration file. See ```config/default.js``` for available settings.
  > <pre>
  > 'use strict';
  > // Minimum required configuration settings
  > module.exports = {
  >   encryptionSecret: 'this is a secret passphrase, it is pretty important to be unguessable',
  >   dingus: {
  >     selfBaseUrl: 'https://hub.squeep.com/',
  >   },
  >   db: {
  >     connectionString: 'postgresql://websubhub:mypassword@localhost/websubhub',
  >   },
  > };
  > </pre>
- Prepare PostgreSQL with a user and database, using e.g. ```psql```.
  > <pre>
  > CREATE ROLE websubhub WITH CREATEDB LOGIN PASSWORD 'mypassword';
  > GRANT websubhub TO postgres;
  > CREATE DATABASE websubhub OWNER=websubhub;
  > GRANT ALL PRIVILEGES ON DATABASE websubhub TO websubhub;
  > \c websubhub
  > CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
  > </pre>
- Install the process manager, system-wide.  
  ```npm i -g pm2```
- Configure the process manager to keep the server logs from growing unbounded.  
  ```pm2 install pm2-logrotate```  
  ```pm2 set pm2-logrotate:rotateInterval '0 0 1 * *'``` (rotate monthly)  
  ```pm2 set pm2-logrotate:compress true```  
  ```pm2 startup``` (arrange to start process monitor on system boot)
- Launch the server, running one process per CPU, and persist it through reboots.  
  ```NODE_ENV=production pm2 start server.js --name websubhub -i max```  
  ```pm2 save```
- Create an administration user.  
  ```NODE_ENV=production node bin/authUserAdd.js admin```
- Copy the static files to somewhere nginx will serve them from. This will vary greatly depending on your setup.  
  ```cp -rp static /home/websubhub/hub.squeep.com/html/static```
- Expose the server through nginx.
  > <pre>
  > # Assumption: 'websubhub' names an upstream defined elsewhere which points at the Hub's listening address.
  > server {
  >   listen 443 ssl http2;
  >   ssl_certificate /etc/ssl/nginx/server-chain.pem;
  >   ssl_certificate_key /etc/ssl/nginx/server.key;
  >   server_name hub.squeep.com;
  >   root /home/websubhub/hub.squeep.com/html;
  >   try_files $uri $uri/ @websubhub;
  >
  >   location @websubhub {
  >     proxy_pass http://websubhub$uri;
  >     proxy_set_header Host $host;
  >     proxy_set_header X-Forwarded-For $remote_addr;
  >     proxy_set_header X-Forwarded-Proto $scheme;
  >     proxy_http_version 1.1;
  >   }
  >
  >   location = / {
  >     proxy_pass http://websubhub$is_args$args;
  >     proxy_set_header Host $host;
  >     proxy_set_header X-Forwarded-For $remote_addr;
  >     proxy_set_header X-Forwarded-Proto $scheme;
  >     proxy_http_version 1.1;
  >   }
  > }
  > </pre>

  ```nginx -s reload```
- The Hub server should now be available!

## Frills

A rudimentary tally of a topic's subscribers is available on the `/info?topic=topicUrl` endpoint. The `topicUrl` value should be URI-encoded. Formats available are SVG badge, JSON, and plain text, selectable by setting e.g. `format=svg` in the query parameters.

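As a sketch of the encoding involved, the following builds an `/info` URL for a given topic. The `infoUrl` helper and the example topic are hypothetical; only the `topic` and `format` parameters and the `svg` value come from the description above.

```javascript
'use strict';

// Hypothetical helper: build an /info URL for a topic on a given hub.
function infoUrl(hubBaseUrl, topicUrl, format) {
  const url = new URL('info', hubBaseUrl);
  // searchParams percent-encodes the topic URL for us.
  url.searchParams.set('topic', topicUrl);
  if (format) {
    url.searchParams.set('format', format);
  }
  return url.toString();
}

console.log(infoUrl('https://hub.squeep.com/', 'https://example.com/blog/', 'svg'));
// https://hub.squeep.com/info?topic=https%3A%2F%2Fexample.com%2Fblog%2F&format=svg
```

The resulting URL could then be used as, for example, the source of a badge image on the topic's own page.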
## Architecture

The Hub keeps track of three primary entities:

- Topics: data and metadata for a published content endpoint. Topics are unique by source URL.
- Subscriptions: the relationship between a client requesting content and the topic providing it. Subscriptions are unique by topic and client URL.
- Verifications: updates to subscriptions which are pending confirmation. Verifications are not unique, but only the most recent for any Subscription pairing will be acted upon.

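The "most recent verification wins" rule above can be illustrated with a small in-memory model. This is a sketch only: the Hub keeps these entities in database tables, and the names used here (`requestVerification`, `pendingVerifications`, the example URLs) are hypothetical.

```javascript
'use strict';

// Hypothetical in-memory stand-in for the verification table.
const verifications = []; // not unique; entries are only ever appended

// A subscription pairing is identified by its topic and client (callback) URL.
const pairingKey = (topicUrl, callbackUrl) => JSON.stringify([topicUrl, callbackUrl]);

function requestVerification(topicUrl, callbackUrl, mode) {
  verifications.push({ key: pairingKey(topicUrl, callbackUrl), mode });
}

// Only the most recent verification for each pairing is acted upon.
function pendingVerifications() {
  const latest = new Map();
  for (const verification of verifications) {
    latest.set(verification.key, verification); // later entries replace earlier ones
  }
  return [...latest.values()];
}

requestVerification('https://example.com/feed', 'https://reader.example/cb', 'subscribe');
requestVerification('https://example.com/feed', 'https://reader.example/cb', 'unsubscribe');
requestVerification('https://example.com/feed', 'https://other.example/cb', 'subscribe');

// Two pairings remain; the first pairing's earlier 'subscribe' is superseded.
console.log(pendingVerifications().map((v) => v.mode));
```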
Any tasks in progress (notably: fetching new topic content, distributing that content to subscribers, or confirming pending verifications) are doled out and managed by a cooperative advisory locking mechanism. The task queue is wrangled in the database within the `*_in_progress` tables.

![Entity relationship diagram for Postgres engine](./documentation/media/postgres-er.svg)

A Hub node will periodically check for more tasks to perform, executing them up to a set concurrency limit.

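The polling behaviour described above might be sketched as follows. This is an illustration of a concurrency-capped polling pass, not the Hub's `src/worker.js`; the `TaskPool` class and the `claimTasks` callback (standing in for claiming work from the database) are hypothetical.

```javascript
'use strict';

// Hypothetical sketch of a polling worker with a concurrency cap.
class TaskPool {
  /**
   * @param {(wanted: number) => Promise<Array<() => Promise<any>>>} claimTasks
   *   stand-in for claiming up to `wanted` tasks from the database
   * @param {number} concurrency maximum number of tasks in flight
   */
  constructor(claimTasks, concurrency) {
    this.claimTasks = claimTasks;
    this.concurrency = concurrency;
    this.inFlight = new Set();
  }

  // One polling pass: claim only as many tasks as there are free slots.
  async poll() {
    const free = this.concurrency - this.inFlight.size;
    if (free <= 0) {
      return 0;
    }
    const tasks = await this.claimTasks(free);
    for (const task of tasks) {
      const running = Promise.resolve()
        .then(task)
        .catch(() => {}) // a real worker would log the failure and release the task
        .finally(() => this.inFlight.delete(running));
      this.inFlight.add(running);
    }
    return tasks.length;
  }
}
```

A node would then call something like `setInterval(() => pool.poll(), 1000)` to check for work periodically; the interval shown is arbitrary.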
### Quirks

This implementation is built atop an in-house API framework, for Reasons. It would not be hard to replace it with something more mainstream, but that is not currently a design goal.

### File Tour

- bin/ - utility scripts
- config/
  - default.js - defines all configuration parameters' default values
  - index.js - merges an environment's values over defaults
  - *.js - environment specific values, edit these as needed
- server.js - launches the application server
- src/
  - common.js - utility functions
  - communication.js - outgoing requests and associated logic
  - db/
    - base.js - abstract database class that any engine will implement
    - errors.js - database Error types
    - index.js - database factory
    - schema-version-helper.js - schema migrations aide
    - postgres/
      - index.js - PostgreSQL implementation
      - listener.js - notify/listen connection to support topic content caching
      - sql/ - statements and schemas
    - sqlite/
      - index.js - SQLite implementation
      - sql/ - statements and schemas
  - enum.js - invariants
  - errors.js - local Error types
  - link-helper.js - processes Link headers
  - logger/ - adds service-specific data filters to our logging module
  - manager.js - processes incoming requests
  - service.js - defines incoming endpoints, linking the API server framework to the manager methods
  - template/ - HTML content
  - worker.js - maintains a pool of tasks in progress, for sending out updates, performing verifications, et cetera
- static/ - static assets
- test/ - unit and coverage tests
- test-e2e/ - support for whole-service testing