GNU social Developer's handbook

The book in your hands was written to teach you the art of contributing to the GNU social codebase.

It starts by introducing the Modules system and architecture, then the plugin development process and finally the exciting internals of GNU social for those looking forward to making the most advanced contributions.

What you need to dive in

This is the recommended setup, for simplicity and integration with the provided tools.

OR

  • At least a webserver such as nginx and a DBMS such as postgresql.
  • Depending on what you want to do, you may want to set up a queue (and caching) system such as redis.

To learn how to set all of that up, you may refer to the System Administrator's handbook.

This book assumes prior programming knowledge. If that doesn't sound like you, refer to our study resources to get started.

The other documentations are equally relevant to a developer

The User one illustrates the common use cases and the customization possibilities, and introduces the existing functionality.

The Administrator one explains, step by step, how to install and maintain a GNU social instance, be it as a node of The Free Network or as an intranet social network in a company setting.

The Designer one is an in-depth overview of the design motifs and the design language used. It is useful when creating new plugins and serves as a basis for new themes.

Tests

When you're looking for usage instructions for anything, refer to the tests directory. The unit tests are properly commented and very extensive; they try to cover every use scenario.

Architecture

Core

The core tries to be minimal. In essence, it is a set of wrappers around Symfony. It provides:

Everything else uses most of this.

Modules

The GNU social Component-based architecture provides a clear distinction between what can not be changed (core), what is replaceable but must be always present (component), and what can be removed or added (plugin).

This architecture has terminology differences when compared to the one that was introduced in v2. In fact, back in v2, since the term "modules" does not necessarily imply non-essential, we kept a "modules" directory next to "plugins" to make the intended difference between the two self-evident.

Now in v3, Module is the name we give to the core system managing all the modules (as it is broad enough to include both components and plugins). N.B.: there are no modules in the same sense as there are components and plugins; the latter descend from the former.

Components

The most fundamental modules are the components. These are non-core functionality expected to be always available. Unlike the core, they can be exchanged for equivalent components.

We have components for two key reasons:

  • to make available internal higher level APIs, i.e. more abstract ways of interacting with the Core;
  • to implement all the basic/essential GNU social functionality in the very same way we would implement plugins.

Currently, GNU social has the following components:

  • Avatar
  • Posting

Design principles

  • Components are independent so do not interfere with each other;
  • Component implementations are hidden;
  • Communication is through well-defined events and interfaces (for models);
  • One component can be replaced by another if its events are maintained.

Plugins (Unix Tools Design Philosophy)

GNU social is true to the Unix-philosophy of small programs to do a small job.

  • Compact and concise input syntax, making full use of ASCII repertoire to minimise keystrokes;
  • Output format should be simple and easily usable as input for other programs;
  • Programs can be joined together in “pipes” and “scripts” to solve more complex problems;
  • Each tool originally performed a simple single function;
  • Prefer reusing existing tools with minor extension to rewriting a new tool from scratch;
  • The main user-interface software (“shell”) is a normal replaceable program without special privileges;
  • Support for automating routine tasks.

Brian W. Kernighan, Rob Pike: The Unix Programming Environment. Prentice-Hall, 1984.

For instructions on how to implement a plugin and use the core functionality check the Plugins chapter.

Dependencies

  • The Core only depends on Symfony. We wrote wrappers for all the Symfony functionality we use, making it possible to replace Symfony in the future if needed and to make it usable under our programming paradigms, philosophies and conventions. V2 tried to do this with PEAR.
  • Components only depend on the Core. The Core never depends on Components.
  • Components never have inter-dependencies.
  • Plugins can depend both on the Core and on Components.
  • A plugin may recognize the existence of other plugins and provide extra functionality via events.

N.B.: "depend on" and "allowing to" have different implications. A plugin can throw an event and other plugins may handle such event. On the other hand, it's wrong if:

  • two plugins are inter-dependent in order to provide all of their useful functionality - consider adding configuration to your plugin;
  • a component depends on or handles events from plugins - consider throwing an event from your component replacement and then handling it from a plugin.

This "hierarchy" makes the flow of things perceivable and predictable, which helps to maintain sanity.

GNU social Coding Style

Please comply with PSR-12 and the following standard when working on GNU social if you want your patches accepted and modules included in supported releases.

If you see code which doesn't comply with the below, please fix it :)

Programming Paradigms

GNU social is written with multiple programming paradigms in different places.

Most GNU social code is procedural, contained in functions whose names start with "on". The "on" prefix ties a function to the Event dispatcher (onEventName), which allows for a declarative structure.

Hence, the most common function structure is the one in the following example:

public function onRainStart(array &$args): bool
{
    Util::openUmbrella();
    return true;
}

Things to note in the example above:

  • This function will be called when the event "RainStart" is dispatched, thus its declarative nature. More on that in the Events chapter.
  • We call a static function from a Util class. That's often how we use classes in GNU social. A notable exception being Entities. More on that in the Database chapter.

It's also common to have functional code snippets in the middle of otherwise entirely imperative blocks (e.g., for handling list manipulation). For this we often use the library Functional PHP.

Reflective programming, variable functions and magic methods are sometimes employed in the core. These techniques go against what is adopted and recommended outside of the core (components, plugins, etc.). The core is a lower-level part of GNU social that carefully takes advantage of these resources. Unless you're contributing to the core, you most likely shouldn't use them.

PHP allows for a high level of code expression. In GNU social we have conventions for when each programming style should be adopted, as well as methods for handling some common operations. One such example is string parsing: we never chain various substring calls. We write a regex pattern and then call preg_match instead. This consistency contributes greatly to more readable and more maintainable code.
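
As a minimal sketch of the string parsing convention, the helper below pulls the two parts out of a handle with a single preg_match call instead of chained strpos/substr calls. The function name parseHandle and the pattern are made up for illustration; they are not part of GNU social.

```php
<?php

/**
 * Hypothetical helper (not part of GNU social): split "@nickname@host"
 * into its parts with one regex instead of chained substring calls
 *
 * @return null|array{nickname: string, host: string} null when $handle doesn't match
 */
function parseHandle(string $handle): ?array
{
    // Named groups keep the call site readable
    $pattern = '/^@?(?<nickname>[a-z0-9_]+)@(?<host>[a-z0-9.-]+)$/i';
    if (preg_match($pattern, $handle, $matches) !== 1) {
        return null;
    }
    return ['nickname' => $matches['nickname'], 'host' => $matches['host']];
}

$parts = parseHandle('@alice@social.example');
echo is_null($parts) ? 'No match' : "{$parts['nickname']} at {$parts['host']}";
```

The pattern documents the expected input in one place, which is harder to achieve with a sequence of offsets.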

Strings

Use ' instead of " for strings, where substitutions aren't required. This is a performance issue, and prevents a lot of inconsistent coding styles. When using substitutions, use curly braces around your variables - like so:

$var = "my_var: {$my_var}";

Comments and Documentation

Comments go on the line ABOVE the code, NOT to the right of the code, unless it is very short. All functions and methods are to be documented using PhpDocumentor - https://docs.phpdoc.org/guides/

File Headers

File headers follow a consistent format, as such:

 // This file is part of GNU social - https://www.gnu.org/software/social
 //
 // GNU social is free software: you can redistribute it and/or modify
 // it under the terms of the GNU Affero General Public License as published by
 // the Free Software Foundation, either version 3 of the License, or
 // (at your option) any later version.
 //
 // GNU social is distributed in the hope that it will be useful,
 // but WITHOUT ANY WARRANTY; without even the implied warranty of
 // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 // GNU Affero General Public License for more details.
 //
 // You should have received a copy of the GNU Affero General Public License
 // along with GNU social.  If not, see <http://www.gnu.org/licenses/>.

 /**
  * Description of this file.
  *
  * @package   samples
  * @author    Diogo Cordeiro <diogo@fc.up.pt>
  * @copyright 2019 Free Software Foundation, Inc http://www.fsf.org
  * @license   https://www.gnu.org/licenses/agpl.html GNU AGPL v3 or later
  */

Please use it.

A few notes:

  • The description of the file doesn't have to be exhaustive. Rather it's meant to be a short summary of what's in this file and what it does. Try to keep it to 1-5 lines. You can get more in-depth when documenting individual functions!

  • You'll probably see files with multiple authors. This is by design: many people contributed to GNU social or its forebears! If you are modifying an existing file, APPEND your own author line, and update the copyright year if needed. Do not replace existing ones.

Paragraph spacing

Wherever possible, try to keep the lines to 80 characters. Don't sacrifice readability for it though - if it makes more sense to have it in one longer line, and it's more easily read that way, that's fine.

With assignments, avoid breaking them down into multiple lines unless necessary, except for enumerations and arrays.

'If' statements format

Use switch statements where many else if's are going to be used. Switch/case is faster.

 if ($var === 'example') {
     echo 'This is only an example';
 } else {
     echo 'This is not a test.  This is the real thing';
 }

Do NOT make if statements like this:

 if ($var === 'example'){ echo 'An example'; }

OR this

 if ($var === 'example')
         echo "An {$var}";

Associative arrays

Always use [] instead of array(). Associative arrays must be written in the following manner:

 $array = [
     'var' => 'value',
     'var2' => 'value2'
 ];

Note that spaces are preferred around the '=>'.

A note about shorthands

Some shorthands are evil:

  • Use the long format for <?php. Do NOT use <?.
  • Use the long format for <?php echo. Do NOT use <?=.

Naming conventions

Respect PSR-12 first.

  • Classes use PascalCase (e.g. MyClass).
  • Functions/Methods use camelCase (e.g. myFunction).
  • Variables use snake_case (e.g. my_variable).

A note on variable names, etc. It must be possible to understand what is meant without necessarily seeing it in context, because the code that calls something might not always make it clear.

So if you have something like:

 $notice->post($contents);

Well I can easily tell what you're doing there because the names are straightforward and clear.

Something like this:

 $foo->bar();

Is much less clear.

Also, wherever possible, avoid ambiguous terms. For example, don't use text as a term for a variable. Call back to "contents" above.

Arrays

Even though PSR-12 doesn't specify rules for array formatting, it is in its spirit to put every array element on a new line, as is done for function and class method arguments and condition expressions, if there is more than one element. In this case, even the last element should end with a comma, to ease adding elements later.

 $foo = ['first' => 'unu'];
 $bar = [
     'first'  => 'once',
     'second' => 'twice',
     'third'  => 'thrice',
 ];

Comparisons

Always use symbol-based comparison operators (&&, ||) instead of text-based operators (and, or) in an "if" clause, as the two sets have different precedence and are therefore evaluated in a different order. This will prevent any confusion or strange results.

Prefer using === instead of == when possible. Version 3 started with PHP 8, so use strict typing whenever possible. Using strict comparisons takes good advantage of that.
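
To see why this matters, here is a small standalone sketch (the variable names are only illustrative). It shows how the text-based operator changes the result of an assignment, and how loose and strict comparison disagree on two numeric strings.

```php
<?php

// 'and' binds more loosely than '=', so this parses as ($loose = true) and false
$loose = true and false;
// '&&' binds more tightly than '=', so this parses as $strict = (true && false)
$strict = true && false;

var_dump($loose);  // bool(true)
var_dump($strict); // bool(false)

// Loose comparison treats two numeric strings as numbers; strict comparison does not
var_dump('1' == '01');  // bool(true)
var_dump('1' === '01'); // bool(false)
```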

Use English

All variables, classes, methods, functions and comments must be in English. Bad English is easier to work with than having to babelfish code to work out how it works.

Encoding

Files should be in UTF-8 encoding with UNIX line endings.

No ending tag

Files should not end with an ending php tag "?>". Any whitespace after the closing tag is sent to the browser and causes errors, so don't include it.

Nesting Functions

Avoid, if at all possible. When not possible, document the living daylights out of why you're nesting it. It's not always avoidable, but PHP has a lot of obscure problems that come up with using nested functions.

If you must use a nested function, be sure to have robust error-handling. This is a must and submissions including nested functions that do not have robust error handling will be rejected and you'll be asked to add it.

Scoping

Properly enforcing scope of functions is something many PHP programmers don't do, but should.

In general:

  • Variables unique to a class should be protected and use interfacing to change them. This allows for input validation and making sure we don't have injection, especially when something's exposed to the API, which any program can use, and not all of them are going to be safe and trusted.

  • Variables not unique to a class should be validated prior to every call, which is why it's generally not a good idea to re-use stuff across classes unless there's significant performance gains to doing so.

  • Classes should protect functions that they do not want overridden, but they should avoid protecting the constructor and destructor and related helper functions as this prevents proper inheritance.
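
The first point can be sketched as follows. Profile is a hypothetical class written for this example, not a GNU social entity, and the nickname rule is an assumption. The property is protected, and the only way to change it goes through a setter that validates its input.

```php
<?php

// Hypothetical class for illustration; not a GNU social entity
class Profile
{
    protected string $nickname = '';

    public function setNickname(string $nickname): void
    {
        // Validate before touching internal state (the rule itself is an assumption)
        if (preg_match('/^[a-z0-9_]{1,64}$/', $nickname) !== 1) {
            throw new InvalidArgumentException('Invalid nickname');
        }
        $this->nickname = $nickname;
    }

    public function getNickname(): string
    {
        return $this->nickname;
    }
}

$profile = new Profile();
$profile->setNickname('alice');
echo $profile->getNickname();
```

In GNU social code you would throw one of the project's own exception classes instead of InvalidArgumentException, which is used here only to keep the sketch standalone.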

Typecasting

PHP is a soft-typed language, so it falls to us developers to make sure that we are using the proper inputs. When possible, use explicit type casting. Where it isn't, you're going to have to make sure that you check all your inputs before you pass them.

All inputs should be cast as an explicit PHP type.

Not properly typecasting is a shooting offence. Soft types let programmers get away with a lot of lazy code, but lazy code is buggy code, and frankly, we don't want it in GNU social if it's going to be buggy.
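
A short sketch of both situations, using a made-up $query array in place of real request data. The first value is cast explicitly; the second is validated first, because a cast would silently turn bad input into 0.

```php
<?php

// Hypothetical request data; query parameters always arrive as strings
$query = ['page' => '3', 'limit' => 'abc'];

// Explicit cast: the result is guaranteed to be an int
$page = (int) $query['page'];

// When a cast would hide bad input, validate first
$limit = filter_var($query['limit'], FILTER_VALIDATE_INT);
if ($limit === false) {
    // Hypothetical default for this sketch
    $limit = 20;
}

var_dump($page, $limit);
```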

Consistent exception handling

Consistency is key to good code to begin with, but it is especially important to be consistent with how we handle errors. GNU social has a variety of built-in exception classes. Use them, wherever it's possible and appropriate, and they will do the heavy lifting for you.

Additionally, ensure you clean up any and all records and variables that need cleanup in a function using try { } finally { } even if you do not plan on catching exceptions (why wouldn't you, though? That's silly.).

If you do not call an exception handler, you must, at a minimum, record errors to the log using Log::level(message).

Ensure all possible control flows of a function have exception handling and cleanup, where appropriate. Don't leave endpoints with unhandled exceptions. Try not to leave something in an error state if it's avoidable.
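
The cleanup rule can be sketched with a temporary file. countLinesViaTempFile is a hypothetical helper, and RuntimeException stands in for the GNU social exception class you would pick in real code. The finally block removes the file whether the function returns or throws.

```php
<?php

/**
 * Hypothetical helper: count the lines of a string by way of a temporary
 * file, to show cleanup in a finally block
 */
function countLinesViaTempFile(string $contents): int
{
    $path = tempnam(sys_get_temp_dir(), 'gs_');
    if ($path === false) {
        throw new RuntimeException('Could not create a temporary file');
    }
    try {
        file_put_contents($path, $contents);
        $lines = file($path, FILE_IGNORE_NEW_LINES);
        if ($lines === false) {
            throw new RuntimeException("Could not read {$path}");
        }
        return count($lines);
    } finally {
        // Runs on return and on exception alike
        unlink($path);
    }
}

echo countLinesViaTempFile("first\nsecond\nthird");
```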

NULL, VOID and SET

When programming in PHP it's common to have to represent the absence of a value: a variable that hasn't been initialized yet, or a function that could not produce a value. In the latter case, one could be tempted to throw an exception, but that kind of failure doesn't always fit the panic/exception/crash category.

On the discussion of whether to use === null vs is_null(), the literature online is diverse and divided. We conducted an internal poll and the winner was is_null().

Some facts to consider:

  1. null is both a data type, and a value;
  2. As noted in PHP's documentation, the constant null forces a variable to be of type null;
  3. A variable with null value returns false in an isset() test, despite that, assigning a variable to NULL is not the same as unsetting it. To actually test whether a variable is set or not requires adopting different strategies per context (https://stackoverflow.com/a/18646568).
  4. The void return type doesn't return NULL, but if used as an expression, it evaluates to null.
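
Facts 3 and 4 can be checked with a short sketch. The function and variable names are made up for illustration.

```php
<?php

function returnsNothing(): void
{
}

$assigned_null = null;

// A variable holding null fails isset(), yet it is still defined
var_dump(isset($assigned_null));                                 // bool(false)
var_dump(array_key_exists('assigned_null', get_defined_vars())); // bool(true)

// A void function call used as an expression evaluates to null
var_dump(is_null(returnsNothing()));                             // bool(true)
```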

Considering union types and what we use null to represent, we believe that our use of null is always akin to that of an Option type. Here's an example:

function sometimes_has_answer(): ?int
{
    return random_int(1, 100) < 50 ? 42 : null;
}

$answer = sometimes_has_answer();
if (!is_null($answer)) {
    echo "Hey, we've got an {$answer}!";
} else {
    echo 'Sorry, no value. Better luck next time!';
}

A non-void function, by definition, is expected to return a value. If it couldn't, and it didn't run into an exceptional scenario, then you should test for that in a different style from a regular strict comparison. Since you're testing whether a variable is of type null, you should use is_null($var), just as you normally would with is_int($var) or is_countable($var).

About nullable types, we prefer that you use the shorthand ?T instead of the full form T|null, as it suggests that you're considering the possibility of not having the value of a certain variable. This apparent intent is reinforced by the fact that null could not be a standalone type in PHP before version 8.2.

Tools

GNU social provides some tools to aid developers. Most of the more common ones can be accessed with the make command, through the corresponding Makefile. These tools include a shell in the PHP docker container, a REPL in the application environment, a PostgreSQL shell and PHPStan, among others.

Pre Commit Hooks

A git pre-commit hook is provided, which gets installed when composer install is run. This hook is responsible for running multiple things that aid in maintaining code quality.

Specifically, we have set up:

  • PHP Code Style fixer
  • PHP Documentation Checker
  • PHPStan

These can be disabled for a given commit by prepending the command with one of SKIP_CS_FIX, SKIP_DOC_CHECK, SKIP_PHPSTAN or SKIP_ALL, to disable the corresponding step.

Example

SKIP_PHPSTAN=1 git commit

These should be used sparingly; they're primarily useful for work-in-progress commits and when rebasing.

Make

Check the Makefile for an up to date list of commands available, but notable ones include:

  • php-shell - A shell in the PHP docker container, where you can run commands from the bin folder (see below) or composer
  • php-repl - A REPL in the context of the application, allowing you to run functions for testing. Note that this requires explicitly importing the needed classes with use Class; like in a regular file
  • psql-shell - A PostgreSQL shell, if you need to edit the database contents manually
  • database-force-schema-update - If some entity changed, run this to add/remove/update any missing fields
  • test - Run PHPUnit tests
  • phpstan - Run PHPStan

These can be run by using make php-repl, or the desired command.

Bin

In addition, some useful scripts are provided in the bin/ directory. Typically, these should be run inside the Docker container. This can be done by running make php-shell.

Specifically:

  • bin/console - The Symfony console
  • bin/generate_entity_fields - Update autogenerated entity code. Must be run after changing database definitions, i.e. when altering the schemaDef static method in a class that extends Entity

Check the Symfony Console reference for details on available commands.

In addition to those, we have added app:events, which lists the application events and their handlers and can be invoked as bin/console app:events.

Exception handling

Exceptions are a control-flow mechanism. The motivation for this mechanism was specifically to separate error handling from non-error-handling code, given that in the common case error handling is very repetitive and bears little relevance to the main part of the logic.

The exception safety level adopted in GNU social is Strong, which makes use of commit or rollback semantics: Operations can fail, but failed operations are guaranteed to have no side effects, leaving the original values intact.
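
One way to get commit or rollback semantics in plain PHP is to work on a copy and only replace the original once every step has succeeded. The sketch below assumes a hypothetical addTags function and uses InvalidArgumentException only to stay standalone; in GNU social you would throw one of the project's own exceptions.

```php
<?php

/**
 * Hypothetical example: append several tags to a list with the strong guarantee.
 * Work happens on a copy; the caller's array is only replaced on success.
 *
 * @param string[] $tags     list to update (I/O parameter)
 * @param string[] $new_tags tags to append
 */
function addTags(array &$tags, array $new_tags): void
{
    // PHP arrays are copied by value, so this is an independent copy
    $working_copy = $tags;
    foreach ($new_tags as $tag) {
        if ($tag === '') {
            // Failing here leaves $tags untouched (rollback)
            throw new InvalidArgumentException('Empty tag');
        }
        $working_copy[] = $tag;
    }
    // Commit
    $tags = $working_copy;
}

$tags = ['gnu'];
addTags($tags, ['social', 'fediverse']);
echo implode(', ', $tags);
```

For database work the same guarantee comes from transactions, as described in the Database chapter.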

In GNU social, exceptions thrown by a function should be part of its declaration. They are part of the contract defined by its interface: This function does A, or fails because of B or C.

N.B.: An error or exception means that the function cannot achieve its advertised purpose. You should never base your logic around a function's exception behaviour.

PHP has concise ways to call a function that returns multiple values (arrays/lists) and to use I/O parameters, so do not be tempted to use exceptions for that purpose. Exceptions are for exceptional cases and not part of the regular flow.

Furthermore, do not use error codes; that's not how we handle errors in GNU social. E.g., if you return values such as 42, 1337 or 31337 instead of throwing a FileNotFoundException, that function cannot be easily understood.

Why different exceptions?

What can your caller do when it receives an exception? It makes sense to have different exception classes, so the caller can decide whether to retry the same call, to use a different solution (e.g., use a fallback source instead), or quit.
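
A minimal sketch of that decision on the caller's side. The two exception classes are stand-ins declared here so the example runs on its own (in GNU social you would import the real ones), and loadFromPrimary and loadWithFallback are hypothetical functions.

```php
<?php

// Stand-ins so the sketch runs on its own; in GNU social, use the real classes
class ServerException extends Exception
{
}
class NotFoundException extends ServerException
{
}

/** Hypothetical loader that never finds the requested entry */
function loadFromPrimary(string $key): string
{
    throw new NotFoundException("No entry for {$key}");
}

function loadWithFallback(string $key): string
{
    try {
        return loadFromPrimary($key);
    } catch (NotFoundException) {
        // A missing entry is recoverable here: use a fallback value
        return 'default';
    }
    // Any other ServerException is not caught and propagates to the caller
}

echo loadWithFallback('site-title');
```

Because only NotFoundException is caught, other failures keep travelling up to a level that can deal with them.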

Hierarchy

GNU social has two exception hierarchies:

  • Server exceptions: For the most part, the hierarchy beneath this class should be broad, not deep. You'll probably want to log these with a good level of detail.
  • Client exceptions: Used to inform the end user about a problem with their input. That means creating a user-friendly message. These will hardly be relevant to log.

Do not extend the PHP Exception class directly; always extend a derivative of one of these two root exception classes.

  • Exception (from PHP)
    • ClientException (Indicates a client request contains some sort of error. HTTP code 400.)

      • InvalidFormException (Invalid form submitted.)
      • NicknameException (Nickname empty exception.)
        • NicknameEmptyException
        • NicknameInvalidException
        • NicknameReservedException
        • NicknameTakenException
        • NicknameTooLongException
        • NicknameTooShortException
    • ServerException

      • DuplicateFoundException (Duplicate value found in DB when not expected.)
      • NoLoggedInUser (No user logged in.)
      • NotFoundException
      • TemporaryFileException (TemporaryFile errors.)
        • NoSuchFileException (No such file found.)
        • NoSuchNoteException (No such note.)

General recommendations

(Adapted from http://codebuild.blogspot.com/2012/01/15-best-practices-about-exception.html)

  • In the general case you want to keep your exceptions broad but not too broad. You only need to deepen it in situations where it is useful to the caller. For example, if there are five reasons a message might fail from client to server, you could define a ServerMessageFault exception, then define an exception class for each of those five errors beneath that. That way, the caller can just catch the superclass if it needs or wants to. Try to limit this to specific, reasonable cases.
  • Deal with errors/exceptions at the appropriate level. If lower in the call stack, awesome. Quite often, the most appropriate is a much higher level.
  • Don't manage logic with exceptions: if a check can be done clearly with an if-else statement, don't use exceptions, because they reduce readability and performance (e.g., null checks, divide-by-zero checks).
  • Exception names must be clear and meaningful, stating the causes of exception.
  • Catch specific exceptions instead of the top Exception class. This will bring additional performance, readability and more specific exception handling.
  • Try not to re-throw exceptions, because of the cost. If re-throwing is a must, re-throw the same exception instead of creating a new one. This will bring additional performance. You may add additional info to that exception in each layer.
  • Always clean up resources (opened files etc.) and perform this in "finally" blocks.
  • Don't absorb exceptions without logging or acting on them. Ignoring exceptions saves the moment but creates chaos for maintainability later.
  • Exception handling inside a loop is not recommended for most cases. Surround the loop with exception block instead.
  • Granularity is very important. One try block must exist for one basic operation. So don't put hundreds of lines in a try-catch statement.
  • Produce enough documentation for your exception definitions.
  • Don't try to define all of your exception classes before they are actually used. You'll wind up re-doing most of it. When you encounter an error case while writing code, then decide how best to describe that error. Ideally, it should be expressed in the context of what the caller is trying to do.

Events and event handlers

Definitions (adapted from PSR-14)

  • Event - An Event is a message produced by an Emitter. Usually denoting a state change.
  • Listener - A Listener is any PHP callable that expects to be passed an Event. Zero or more Listeners may be passed the same Event. A Listener MAY enqueue some other asynchronous behavior if it so chooses.
  • Emitter - An Emitter is any arbitrary code that wishes to dispatch an Event. This is also known as the "calling code".
  • Dispatcher - The Dispatcher is given an Event by an Emitter. The Dispatcher is responsible for ensuring that the Event is passed to all relevant Listeners.

Pattern

We implement the Observer pattern using the Mediator pattern.

The key is that the emitter should not know what is listening to its events. The dispatcher keeps modules from communicating directly; they communicate through a mediator instead. This supports the Single Responsibility principle by offloading communication to a class that just handles communication.

How does it work? The dispatcher, the central object of the event dispatcher system, notifies listeners of an event dispatched to it. Put another way: your code dispatches an event to the dispatcher, the dispatcher notifies all registered listeners for the event, and each listener does whatever it wants with the event.
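
The flow just described can be sketched with a tiny standalone dispatcher. TinyDispatcher is a hypothetical, simplified class written for this illustration; GNU social's own Event class differs, and the convention that a listener returning false stops propagation is an assumption of the sketch.

```php
<?php

// Hypothetical, simplified dispatcher; GNU social's own Event class differs
final class TinyDispatcher
{
    /** @var array<string, callable[]> listeners indexed by event name */
    private array $listeners = [];

    public function listen(string $event_name, callable $listener): void
    {
        $this->listeners[$event_name][] = $listener;
    }

    /**
     * Notify every listener of $event_name in registration order.
     * A listener returning false stops propagation (assumption of this sketch).
     */
    public function dispatch(string $event_name, array &$res): void
    {
        foreach ($this->listeners[$event_name] ?? [] as $listener) {
            if ($listener($res) === false) {
                break;
            }
        }
    }
}

$dispatcher = new TinyDispatcher();
$dispatcher->listen('RainStart', function (array &$res): bool {
    $res[] = 'umbrella opened';
    return true;
});
$dispatcher->listen('RainStart', function (array &$res): bool {
    $res[] = 'windows closed';
    return false;
});
$dispatcher->listen('RainStart', function (array &$res): bool {
    $res[] = 'never reached';
    return true;
});

$res = [];
$dispatcher->dispatch('RainStart', $res);
echo implode(', ', $res); // prints: umbrella opened, windows closed
```

Note that the emitter only knows the event name; it never references the listeners.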

Example 1: Adding elements to the core UI

An emitter in a core twig template

{% for block in handle_event('ViewAttachment' ~ attachment.getMimetypeMajor() | capitalize , {'attachment': attachment, 'thumbnail_parameters': thumbnail_parameters}) %}
    {{ block | raw }}
{% endfor %}

Listener

/**
 * Generates the view for attachments of type Image
 *
 * @param array $vars Input from the caller/emitter
 * @param array $res I/O parameter used to accumulate or return values from the listener to the emitter
 *
 * @return \EventResult true if not handled or if the handling should be accumulated with other listeners,
 *              false if handled well enough and no other listeners are needed
 */
public function onViewAttachmentImage(array $vars, array &$res): \EventResult
{
    $res[] = Formatting::twigRenderFile('imageEncoder/imageEncoderView.html.twig', ['attachment' => $vars['attachment'], 'thumbnail_parameters' => $vars['thumbnail_parameters']]);
    return Event::stop;
}

Some things to note about this example:

  • The parameters of the handler onViewAttachmentImage are defined by the emitter;
  • Every handler must return an \EventResult (Event::next or Event::stop), as specified in the example docblock.

Example 2: Informing the core about a handler

Event emitted in the core

Event::handle('ResizerAvailable', [&$event_map]);

Event listener in a plugin

/**
 * @param array $event_map output
 *
 * @return \EventResult event hook
 */
public function onResizerAvailable(array &$event_map): \EventResult
{
    $event_map['image'] = 'ResizeImagePath';
    return Event::next;
}

Example 3: Default action

An event can be emitted to perform an action, but still have a fallback as such:

Event emitter

if (Event::handle('EventName', $args) !== Event::stop) {
    // Do default action, as no-one claimed authority on handling this event
}

Database

GNU social has to store a large collection of data for rapid search and retrieval.

GNU social can use different Relational Database Management Systems, namely PostgreSQL and MariaDB.

The storage is interfaced using an Object-Relational Mapper (ORM) paradigm. As the term ORM already hints, this simplifies the translation between database rows and the PHP object model.

Transactions

An EntityManager and the underlying UnitOfWork employ a strategy called transactional write-behind that delays the execution of SQL statements in order to execute them in the most efficient way and at the end of a transaction so that all write locks are quickly released. You should see Doctrine as a tool to synchronize your in-memory objects with the database in well defined units of work. Work with your objects and modify them as usual. For the most part, Doctrine ORM already takes care of proper transaction demarcation for you: All the write operations (INSERT/UPDATE/DELETE) are queued until EntityManager#flush() is invoked which wraps all of these changes in a single transaction.

Declaring an Entity

<?php
namespace Plugin\Embed\Entity;

use App\Core\Entity;
use DateTimeInterface;

class AttachmentEmbed extends Entity
{
    // These tags are meant to be literally included and will be populated with the appropriate fields, setters and getters by `bin/generate_entity_fields`
    // {{{ Autocode
    // }}} Autocode

    public static function schemaDef(): array
    {
        return [
            'name'   => 'attachment_embed',
            'fields' => [
                'attachment_id' => ['type' => 'int', 'not null' => true, 'foreign key' => true, 'target' => 'Attachment.id', 'multiplicity' => 'one to one', 'description' => 'Embed for that URL/file'],
                'mimetype'      => ['type' => 'varchar', 'length' => 50, 'description' => 'mime type of resource'],
                'filename'      => ['type' => 'varchar', 'length' => 191, 'description' => 'file name of resource when available'],
                'media_url'     => ['type' => 'text', 'description' => 'URL for this Embed resource when applicable (photo, link)'],
                'modified'      => ['type' => 'timestamp', 'not null' => true, 'description' => 'date this record was modified'],
            ],
            'primary key'  => ['attachment_id'],
        ];
    }
}

Retrieving an entity

use App\Core\DB\DB;
use App\Util\Exception\NotFoundException;
use App\Util\Exception\DuplicateFoundException;

// ...

try {
    $object = DB::findOneBy('attachment_embed', ['attachment_id' => $attachment->getId()]);
} catch (NotFoundException) {
    // Not found
} catch (DuplicateFoundException) {
    // Integrity compromised
}

Deleting an Entity

DB::delete($object);

Creating an Entity

$embed_data['attachment_id'] = $attachment->getId();
DB::persist(Entity\AttachmentEmbed::create($embed_data));
DB::flush();

Querying the database

When the ORM isn't powerful enough to satisfy your needs, you can resort to Doctrine Query Language, which is preferred and has been extended so you can use table names rather than class names, or Doctrine QueryBuilder.

Cache

In the Database chapter you've learned how GNU social allows you to store data in the database. Depending on your server's specification, the database can be a bottleneck. To mitigate that, you can make use of an in-memory data structure storage to cache previous database requests. Using it is a great way of making GNU social run quicker. GNU social supports many adapters to different storages.

Although different cache adapters provide different functionality that could be nice to take advantage of, we had to limit our cache interface to the basics available in all of them, i.e., store and delete operations.

Store

/**
 * Get the cached avatar file info associated with the given Actor id
 *
 * Returns the avatar file's hash, mimetype, title and path.
 * Ensures exactly one cached value exists
 */
public static function getAvatarFileInfo(int $gsactor_id): array
{
    try {
        $res = GSFile::error(NoAvatarException::class,
            $gsactor_id,
            Cache::get("avatar-file-info-{$gsactor_id}",
                function () use ($gsactor_id) {
                    return DB::dql('select f.file_hash, f.mimetype, f.title ' .
                        'from Component\Attachment\Entity\Attachment f ' .
                        'join App\Entity\Avatar a with f.id = a.attachment_id ' .
                        'where a.gsactor_id = :gsactor_id',
                        ['gsactor_id' => $gsactor_id]);
                }));
        $res['file_path'] = \App\Entity\Avatar::getFilePathStatic($res['file_hash']);
        return $res;
    } catch (Exception $e) {
        $filepath = INSTALLDIR . '/public/assets/default-avatar.svg';
        return ['file_path' => $filepath, 'mimetype' => 'image/svg+xml', 'title' => null];
    }
}

Delete

Cache::delete('avatar-file-info-' . $gsactor_id);
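
The store and delete operations above follow a get-or-compute pattern: the callback only runs on a cache miss. Here is a minimal sketch of that behaviour, assuming a hypothetical in-memory SketchCache class that stands in for App\Core\Cache. The real adapters, their expiry handling and their exact signatures are not covered by this sketch.

```php
<?php

// Hypothetical in-memory stand-in for App\Core\Cache, for illustration only.
final class SketchCache
{
    /** @var array<string, mixed> */
    private static array $store = [];

    // Return the cached value, computing and storing it on a miss.
    public static function get(string $key, callable $calculate): mixed
    {
        if (!array_key_exists($key, self::$store)) {
            self::$store[$key] = $calculate();
        }
        return self::$store[$key];
    }

    // Forget the value so that the next get() recomputes it.
    public static function delete(string $key): void
    {
        unset(self::$store[$key]);
    }
}

$db_hits = 0;
// Stands in for an expensive database query.
$query = function () use (&$db_hits): array {
    ++$db_hits;
    return ['mimetype' => 'image/png', 'title' => 'avatar'];
};

$first  = SketchCache::get('avatar-file-info-1', $query); // miss: runs $query
$second = SketchCache::get('avatar-file-info-1', $query); // hit: $query is skipped
SketchCache::delete('avatar-file-info-1');
$third  = SketchCache::get('avatar-file-info-1', $query); // miss again after delete
```

Deleting the key whenever the underlying rows change keeps readers from seeing stale data.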

Routes and Controllers

Routes

When GNU social receives a request, it calls a controller to generate the response. The routing configuration defines which action to run for each incoming URL.

You create routes by handling the AddRoute event.

public function onAddRoute(RouteLoader $r)
{
    $r->connect('avatar', '/{gsactor_id<\d+>}/avatar/{size<full|big|medium|small>?full}',
                [Controller\Avatar::class, 'avatar_view']);
    $r->connect('settings_avatar', '/settings/avatar',
                [Controller\Avatar::class, 'settings_avatar']);
    return Event::next;
}

The magic happens in $r->connect(string $id, string $uri_path, $target, ?array $options = [], ?array $param_reqs = []). Here is how it works:

  • id: a unique identifier for your route so that you can easily refer to it later, for instance when generating URLs;
  • uri_path: the url to be matched, can be static or have parameters. The variable parts are wrapped in {...} and they must have a unique name;
  • target: Can be an array [Class, Method to invoke] or a string with Class to __invoke;
  • param_reqs: You can either do ['parameter_name' => 'regex'] or write the requirement inline {parameter_name<regex>};
  • options['accept']: The Accept header values this route will match with;
  • options['format']: Response content-type;
  • options['conditions']: https://symfony.com/doc/current/routing.html#matching-expressions ;
  • options['template']: Render a twig template directly from the route.
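
To make the inline requirements concrete, the sketch below spells out which paths the avatar route from the example accepts, using a hand-written regular expression. The function name matchAvatarPath is hypothetical, and the expression only approximates the pattern. It is not what Symfony compiles internally.

```php
<?php

// Hand-written approximation of the 'avatar' route's path pattern:
//   /{gsactor_id<\d+>}/avatar/{size<full|big|medium|small>?full}
// Illustration only; this is not the expression Symfony generates.
function matchAvatarPath(string $path): ?array
{
    $regex = '#^/(?P<gsactor_id>\d+)/avatar(?:/(?P<size>full|big|medium|small))?$#';
    if (preg_match($regex, $path, $m) !== 1) {
        return null; // No match: the router would try the next route
    }
    $size = $m['size'] ?? '';
    return [
        'gsactor_id' => (int) $m['gsactor_id'],
        'size'       => $size !== '' ? $size : 'full', // '?full' supplies the default
    ];
}
```

With this sketch, /42/avatar resolves to the full size, /42/avatar/big to the big one, and /abc/avatar does not match at all because gsactor_id must be numeric.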

Observations

  • The special parameter _format can be used to set the "request format" of the Request object. This is used for such things as setting the Content-Type of the response (e.g. a json format translates into a Content-Type of application/json). This does not override the options['format'] nor the HTTP Accept header information.
$r->connect(id: 'article_show', uri_path: '/articles/search.{_format}',
    target: [ArticleController::class, 'search'],
    param_reqs: ['_format' => 'html|xml']
);
  • An example of a suitable accept headers array would be:
$r->connect('json_test', '/json_only', [C\JSON::class, 'test'], options: [
    'accept' => [
        'application/ld+json; profile="https://www.w3.org/ns/activitystreams"',
        'application/activity+json',
        'application/json',
        'application/ld+json'
    ]]);

Controllers

A controller is a PHP function you create that reads information from the Request object and returns either a Response object or an array that merges with the route options array. The response could be an HTML page, JSON, XML, a file download, a redirect, a 404 error or anything else.

HTTP method

/**
 * @param Request $request
 * @param array   $vars    Twig Template vars and route options
 */
public function onGet(Request $request, array $vars): array|Response
{
    // Placeholder body: return a Response, or an array of template vars and route options
    return $vars;
}

Forms

public function settings_avatar(Request $request): array
{
    $form = Form::create([
        ['avatar', FileType::class,     ['label' => _m('Avatar'), 'help' => _m('You can upload your personal avatar. The maximum file size is 2MB.'), 'multiple' => false, 'required' => false]],
        ['remove', CheckboxType::class, ['label' => _m('Remove avatar'), 'help' => _m('Remove your avatar and use the default one'), 'required' => false, 'value' => false]],
        ['hidden', HiddenType::class,   []],
        ['save',   SubmitType::class,   ['label' => _m('Submit')]],
    ]);

    $form->handleRequest($request);

    if ($form->isSubmitted() && $form->isValid()) {
        $data       = $form->getData();
        $user       = Common::user();
        $gsactor_id = $user->getId();
        // Do things
    }

    return ['_template' => 'settings/avatar.html.twig', 'avatar' => $form->createView()];
}

Templates

GNU social uses the Twig template engine. When you handle a UI-related event, you add your own twig snippets either with App\Util\Formatting::twigRenderFile or App\Util\Formatting::twigRenderString.

Example

public function onAppendRightPanelBlock(array $vars, Request $request, array &$res): bool
{
    if ($vars['path'] == 'attachment_show') {
        $related_notes = DB::dql('select n from attachment_to_note an ' .
            'join note n with n.id = an.note_id ' .
            'where an.attachment_id = :attachment_id', ['attachment_id' => $vars['vars']['attachment_id']]);
        $related_tags = DB::dql('select distinct t.tag ' .
            'from attachment_to_note an join note_tag t with an.note_id = t.note_id ' .
            'where an.attachment_id = :attachment_id', ['attachment_id' => $vars['vars']['attachment_id']]);
        $res[] = Formatting::twigRenderFile('attachmentShowRelated/attachmentRelatedNotes.html.twig', ['related_notes' => $related_notes]);
        $res[] = Formatting::twigRenderFile('attachmentShowRelated/attachmentRelatedTags.html.twig', ['related_tags' => $related_tags]);
    }
    return Event::next;
}

For details on the Twig language itself, refer to the Twig documentation.

Internationalisation

Usage

Basic usage consists of calling App\Core\I18n\I18n::_m. It works like this:

// Both will return the string 'test string'
_m('test string');
_m('test {thing}', ['thing' => 'string']);

This function also supports the ICU message format; refer to the ICU User Guide for more details on how it works. Below are some examples:

$apples = [1 => '1 apple', '# apples'];

_m($apples, ['count' => -42]); // -42 apples
_m($apples, ['count' => 0]); // 0 apples
_m($apples, ['count' => 1]); // 1 apple
_m($apples, ['count' => 2]); // 2 apples
_m($apples, ['count' => 42]); // 42 apples

$apples = [0 => 'no apples', 1 => '1 apple', '# apples'];
_m($apples, ['count' => 0]); // no apples
_m($apples, ['count' => 1]); // 1 apple
_m($apples, ['count' => 2]); // 2 apples
_m($apples, ['count' => 42]); // 42 apples

$pronouns = ['she' => 'her apple', 'he' => 'his apple', 'they' => 'their apple', 'someone\'s apple'];
_m($pronouns, ['pronoun' => 'she']); // her apple
_m($pronouns, ['pronoun' => 'he']); // his apple
_m($pronouns, ['pronoun' => 'they']); // their apple
_m($pronouns, ['pronoun' => 'unknown']); // someone's apple

$complex = [
    'she'   => [1 => 'her apple', 'her # apples'],
    'he'    => [1 => 'his apple', 'his # apples'],
    'their' => [1 => 'their apple', 'their # apples'],
];

_m($complex, ['pronoun' => 'she',  'count' => 1]); // her apple
_m($complex, ['pronoun' => 'he',   'count' => 1]); // his apple
_m($complex, ['pronoun' => 'she',  'count' => 2]); // her 2 apples
_m($complex, ['pronoun' => 'he',   'count' => 2]); // his 2 apples
_m($complex, ['pronoun' => 'she',  'count' => 42]); // her 42 apples
_m($complex, ['pronoun' => 'they', 'count' => 1]); // their apple
_m($complex, ['pronoun' => 'they', 'count' => 3]); // their 3 apples
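
For readers new to ICU, it may help to see what a plural array like $apples could look like as an ICU pattern. The helper below is hypothetical and the mapping is an assumption inferred from the outputs above (every entry but the last is an exact match, the last is the fallback). It is not taken from the I18n implementation.

```php
<?php

// Hypothetical helper: render the plural shorthand as an ICU plural pattern.
// Assumption: every entry but the last is an exact match (=N), the last is 'other'.
function pluralShorthandToIcu(array $forms, string $variable = 'count'): string
{
    $last  = array_key_last($forms);
    $parts = [];
    foreach ($forms as $key => $text) {
        $selector = ($key === $last) ? 'other' : '=' . $key;
        $parts[]  = $selector . '{' . $text . '}';
    }
    return '{' . $variable . ', plural, ' . implode(' ', $parts) . '}';
}

$simple = pluralShorthandToIcu([1 => '1 apple', '# apples']);
$zero   = pluralShorthandToIcu([0 => 'no apples', 1 => '1 apple', '# apples']);
```

In an ICU plural pattern, # is replaced by the number being formatted, which is why the fallback form prints the count.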

Utilities

Two common needs regarding user internationalisation are knowing the user's language and whether it should be rendered right-to-left:

$user_lang = $user->getLanguage();
App\Core\I18n\I18n::isRtl($user_lang);

Logging

GNU social comes with a minimalist logger class. In conformance with the twelve-factor app methodology, it sends messages starting from the WARNING level to stderr.

The minimal log level can be changed by setting the SHELL_VERBOSITY environment variable:

SHELL_VERBOSITY value    Minimum log level
-1                       ERROR
1                        NOTICE
2                        INFO
3                        DEBUG

Log Levels

GNU social supports the logging levels described by RFC 5424.

  • DEBUG (100): Detailed debug information.

  • INFO (200): Interesting events. Examples: User logs in, SQL logs.

  • NOTICE (250): Normal but significant events.

  • WARNING (300): Exceptional occurrences that are not errors. Examples: Use of deprecated APIs, poor use of an API, undesirable things that are not necessarily wrong.

  • ERROR (400): Runtime errors that do not require immediate action but should typically be logged and monitored.

  • CRITICAL (500): Critical conditions. Example: Application component unavailable, unexpected exception.

  • ALERT (550): Action must be taken immediately. Example: Entire website down, database unavailable, etc. This should trigger the SMS alerts and wake you up.

  • EMERGENCY (600): Emergency: system is unusable.

Using

Log::level(message: string, context: array);

  • The message MUST be a string or object implementing __toString().

  • The message MAY contain placeholders in the form: {foo} where foo will be replaced by the context data in key "foo".

  • The context array can contain arbitrary data. The only assumption that can be made by implementors is that if an Exception instance is given to produce a stack trace, it MUST be in a key named "exception".
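
The placeholder rule above can be sketched in a few lines. The function below follows the interpolation example from the PSR-3 specification. Whether GNU social's logger substitutes placeholders exactly this way is an assumption, and the function name interpolate is only illustrative.

```php
<?php

// Replace {placeholders} in $message with matching values from $context.
// Modelled on the interpolation example in the PSR-3 specification.
function interpolate(string $message, array $context = []): string
{
    $replace = [];
    foreach ($context as $key => $value) {
        // Only values that can be cast to a string are substituted
        if (is_scalar($value) || $value === null
            || (is_object($value) && method_exists($value, '__toString'))) {
            $replace['{' . $key . '}'] = (string) $value;
        }
    }
    return strtr($message, $replace);
}

$line = interpolate('Adding a new user {username}', ['username' => 'Seldaek']);
```

Placeholders without a matching context key are left as they are, which makes a missing value easy to spot in the log.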

Where Logs are Stored

By default, log entries are written to the var/log/dev.log file when you’re in the dev environment. In the prod environment, logs are written to var/log/prod.log, but only during a request where an error or high-priority log entry was made (i.e. Log::error(), Log::critical(), Log::alert() or Log::emergency()).

Example usage

Log::info('hello, world.');
// Using the logging context, allowing to pass an array of data along the record:
Log::info('Adding a new user', ['username' => 'Seldaek']);

Queue

Some activities that GNU social can do, like broadcasting with OStatus or ActivityPub, XMPP messages and SMS operations, can be 'queued' and done by asynchronous daemons instead.

Running Queues

Run the queue handler with:

php bin/console messenger:consume async --limit=10 --memory-limit=128M --time-limit=3600

GNU social uses Symfony, so the Symfony Messenger documentation on queues might be useful.

Definitions

  • Message - A Message holds the data to be handled (a variable) and the queue name (a string).
  • QueueHandler - A QueueHandler is an event listener that expects to receive data from the queue.
  • Enqueuer - An Enqueuer is any arbitrary code that wishes to send data to a queue.
  • Transporter - The Transporter is given a Message by an Enqueuer. The Transporter is responsible for ensuring that the Message is passed to all relevant QueueHandlers.

Using Queues

Queues are akin to events.

In your plugin you can call App\Core\Queue::enqueue and send a message to be handled by the queue:

Queue::enqueue($hello_world, 'MyFirstQueue');

and then receive with:

public function onMyFirstQueue($data): bool
{
    // Do something with $data
    return Event::next;
}
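
The roles defined earlier (Enqueuer, Transporter, QueueHandler) can be sketched without any framework. In the sketch below, SketchQueue and GreetingPlugin are hypothetical names; the real App\Core\Queue hands messages to Symfony and a separate worker process consumes them.

```php
<?php

// Hypothetical in-memory transporter, standing in for App\Core\Queue.
final class SketchQueue
{
    /** @var array<int, array{0: mixed, 1: string}> pending [payload, queue name] pairs */
    private array $messages = [];
    /** @var object[] registered queue handlers */
    private array $handlers = [];

    public function addHandler(object $handler): void
    {
        $this->handlers[] = $handler;
    }

    // Enqueuer side: store the message; nothing is handled yet.
    public function enqueue(mixed $payload, string $queue): void
    {
        $this->messages[] = [$payload, $queue];
    }

    // Worker side: pass each message to every handler exposing on<QueueName>.
    public function consume(): int
    {
        $handled = 0;
        while ($message = array_shift($this->messages)) {
            [$payload, $queue] = $message;
            foreach ($this->handlers as $handler) {
                if (method_exists($handler, 'on' . $queue)) {
                    $handler->{'on' . $queue}($payload);
                    ++$handled;
                }
            }
        }
        return $handled;
    }
}

final class GreetingPlugin
{
    public array $seen = [];

    public function onMyFirstQueue($data): bool
    {
        $this->seen[] = $data;
        return true; // Stand-in for Event::next
    }
}

$queue  = new SketchQueue();
$plugin = new GreetingPlugin();
$queue->addHandler($plugin);
$queue->enqueue('hello, world', 'MyFirstQueue');
$pending_before = $plugin->seen; // still empty: nothing runs until a worker consumes
$handled        = $queue->consume();
```

The point of the split is that enqueue returns immediately, so slow work such as federation delivery does not hold up the request that triggered it.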

GNU social comes with a set of core queues with often wanted data: TODO Elaborate.

Attachments, Files, Thumbnails and Links

An attachment in GNU social can represent either a file or a link with a thumbnail.

Files

Storage

Not all information should be stored in the database. Large blobs of data usually find their place in storage. The two most common file abstractions you will find in GNU social are App\Util\TemporaryFile and Symfony\Component\HttpFoundation\File\UploadedFile.

The UploadedFile comes from Symfony and you'll find it when working with forms that have file upload as inputs. The TemporaryFile is how GNU social handles/represents any file that isn't in a permanent state, i.e., not yet ready to be moved to storage.

So, the Attachment entity won't store the information, only point to it.

Example

Here's how the ImageEncoder plugin creates a temporary file to manipulate an image in a transactional fashion before committing its changes:

// TemporaryFile handles deleting the file if some error occurs
$temp = new TemporaryFile(['prefix' => 'image', 'suffix' => $extension]);

$image  = Vips\Image::newFromFile($file->getRealPath(), ['access' => 'sequential']);
$width  = Common::clamp($image->width, 0, Common::config('attachments', 'max_width'));
$height = Common::clamp($image->height, 0, Common::config('attachments', 'max_height'));
$image  = $image->crop(0, 0, $width, $height);
$image->writeToFile($temp->getRealPath());

// Replace original file with the sanitized one
$temp->commit($file->getRealPath());

Note how we:

  1. create a temporary file $temp,
  2. then write the in-memory $image manipulation of $file to storage in $temp,
  3. and only then commit the changes in $temp to $file's location.

Had we written directly to $file, a failure in step 2 would risk corrupting the input. In this case, for performance's sake, most of the manipulation happens in memory, but TemporaryFile is just as useful for in-storage manipulations.
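
The same write-then-commit idea can be sketched with plain PHP functions. The helper replaceFileSafely below is hypothetical and only approximates what TemporaryFile does: it writes beside the target and swaps the result in with rename() once the write has succeeded.

```php
<?php

// Write new contents beside the target, then swap them in with rename().
// Plain PHP stand-in for the TemporaryFile commit flow described above.
function replaceFileSafely(string $target, string $new_contents): void
{
    $temp = tempnam(dirname($target), 'tmp');
    if ($temp === false) {
        throw new RuntimeException('Could not create a temporary file');
    }
    try {
        if (file_put_contents($temp, $new_contents) === false) {
            throw new RuntimeException('Could not write the temporary file');
        }
        // The target is only touched once the new data is fully written
        if (!rename($temp, $target)) {
            throw new RuntimeException('Could not move the temporary file');
        }
    } finally {
        // After a successful rename the temporary path no longer exists
        clearstatcache(true, $temp);
        if (is_file($temp)) {
            unlink($temp);
        }
    }
}

$target = tempnam(sys_get_temp_dir(), 'img');
file_put_contents($target, 'original');
replaceFileSafely($target, 'sanitized');
$result = file_get_contents($target);
unlink($target);
```

If the write fails, the exception propagates, the leftover temporary file is removed, and the original target is left exactly as it was.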

Return a file via HTTP

Okay, it's fun that you can save files. But it isn't very useful if you can't show the amazing changes or files you generated to the client. For that, GNU social has App\Core\GSFile.

Example

public function avatar_view(Request $request, int $gsactor_id)
{
    $res = \Component\Avatar\Avatar::getAvatarFileInfo($gsactor_id);
    return \App\Core\GSFile::sendFile(filepath: $res['file_path'],
                                      mimetype: $res['mimetype'],
                                      output_filename: $res['title'],
                                      disposition: 'inline');
}

Simple enough.

Attachments: Storing a reference in database

Finally, you need a way to refer to previous files. GNU social calls that representation Component\Attachment\Entity\Attachment. If a note refers to an Attachment, you can link them using the AttachmentToNote entity.

Important: The core hashes the files and reuses Attachments. Therefore, if you're deleting a file from storage, you must ensure it is really intended and safe.

To store a file or a URL as an Attachment, call the functions Attachment::validateAndStoreFileAsAttachment and Attachment::validateAndStoreURLAsAttachment, respectively.

Killing an attachment

Because deleting an attachment is different from deleting your regular entity, to delete an attachment you should call the member function kill(). It will decrease the lives count and only remove it if it has lost all its lives.
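
The lives mechanism is a form of reference counting. A minimal sketch follows, assuming a hypothetical SketchAttachment class; the method gainLife and the exact bookkeeping are illustrative and not taken from the Attachment entity.

```php
<?php

// Hypothetical stand-in for an attachment shared by several notes.
final class SketchAttachment
{
    private int $lives = 0;
    private bool $removed = false;

    // A new note starts referring to this attachment.
    public function gainLife(): void
    {
        ++$this->lives;
    }

    // One referrer lets go; the attachment is only removed when no lives remain.
    public function kill(): bool
    {
        if ($this->lives > 0) {
            --$this->lives;
        }
        if ($this->lives === 0) {
            $this->removed = true; // This is where actual removal would happen
        }
        return $this->removed;
    }
}

$attachment = new SketchAttachment();
$attachment->gainLife(); // first note attaches the file
$attachment->gainLife(); // second note reuses the same file
$after_first_kill  = $attachment->kill(); // false: one note still uses it
$after_second_kill = $attachment->kill(); // true: nobody uses it any more
```

This is why deleting a note's attachment cannot simply delete the file: another note may still be pointing at the same hashed content.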

Thumbnails

Both files and links can have an AttachmentThumbnail. You can have an AttachmentThumbnail for every Attachment, but only once the Attachment itself exists. Read a plugin such as ImageEncoder to understand how thumbnails can be generated from files, and StoreRemoteMedia to understand how to generate them from URLs.

The controller asking for them is the App\Controller\Attachment::attachment_thumbnail with a call to Component\Attachment\Entity\AttachmentThumbnail::getOrCreate().

Trade-offs between decoupling and complexity

Questions of this kind are explored in more depth in our wiki. Even so, this case is relevant enough to walk through briefly here. You'll note that the Attachment entity has fairly specific fields such as width and height. For an Attachment holding a song, you could use the width field for the cover image, or simply leave it null. For a song preview, you could use width for the duration and leave height null. The point is, we could have ImageAttachment and ImageAttachmentThumbnail entities created by the ImageEncoder plugin, and move these specificities to the plugin. But the resulting code would require more database requests, become heavier, and be harder to read. Keeping the fields in the core may waste a little more space (maybe!), but if so, it is far from significant. The savings in processing cost and ease of understanding outweigh the storage cost.

We have Links entities for representing links, these are used by the Posting component to represent remote urls. These are fairly similar to the attachment entities.

Security

Validate vs Sanitize

You're probably already familiar with the old saying "Never trust your users' input". If not, you are now.

Sadly, that often worries developers so much that they will sanitize every single user input before storing it. That's, to our eyes, a bad practice. You shouldn't trust your users, but that should never lead you to break data integrity.

Instead of sanitizing before storing, you should validate that the input makes sense, and tell your client if it doesn't.

Sanitize before spitting out

If a user inputs a string containing HTML tags, you shouldn't strip them out before storing. Instead, depending on the context, you should sanitize the string before outputting it. For that you can call App\Core\Security::sanitize(string: $html). Optionally, you can pass a second argument, an array of tags to keep: ['tag'].
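
The difference between the two steps can be sketched with PHP built-ins only. In the sketch below, validateBio and its length rule are hypothetical, and htmlspecialchars() merely stands in for App\Core\Security::sanitize, whose actual behaviour may differ.

```php
<?php

// Hypothetical rule, for illustration: a bio must be 1 to 200 bytes long.
function validateBio(string $bio): string
{
    $length = strlen($bio);
    if ($length < 1 || $length > 200) {
        throw new InvalidArgumentException('Bio must be 1 to 200 bytes long');
    }
    return $bio; // Returned untouched: validation never rewrites the input
}

// Escape at output time, for the HTML context only.
// htmlspecialchars() stands in for App\Core\Security::sanitize here.
function renderBio(string $stored_bio): string
{
    return '<p>' . htmlspecialchars($stored_bio, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '</p>';
}

$input  = 'I <3 <b>federation</b>';
$stored = validateBio($input); // what goes in the database
$html   = renderBio($stored);  // what goes in the page
```

Because the stored value is the user's original text, another output context (JSON, plain-text e-mail) can apply its own escaping later without losing anything.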

Generating a readable confirmation code

TODO

HTTP Client

It is sometimes necessary to perform HTTP requests. For that, we simply have a static wrapper around the Symfony HTTP Client.

You can call App\Core\HTTPClient::{METHOD}(string: $url, array: $options): ResponseInterface. The $options are detailed in the Symfony documentation.

Please note that the HTTP Client is lazy, which makes it very important to bear network errors in mind.

An example where this behaviour has to be considered:

if (Common::isValidHttpUrl($url)) {
    $head = HTTPClient::head($url);
    // This must come before getInfo given that Symfony HTTPClient is lazy (thus forcing curl exec)
    try {
        $headers = $head->getHeaders();
    } catch (ClientException $e) {
        throw new InvalidArgumentException(previous: $e);
    }
    $url      = $head->getInfo('url'); // The last effective url (after getHeaders, so it follows redirects)
    $url_hash = hash(self::URLHASH_ALGO, $url);
} else {
    throw new InvalidArgumentException();
}

What you can retrieve from a response is specified in the Symfony documentation.

Developing Modules

By now you should have already read about how to interact with GNU social's internals.

So now you want to include your own functionality. For that you can create a plugin or replace a component.

Location

  • Third party plugins are placed in local/plugins.
  • Third party components are placed in local/components.

Structure

The structure of a module is similar to that of the core. The tree is

local/plugins/Name
├── composer.json  : Local composer configuration for the module
├── config.yaml    : Default configuration for the module
├── locale         : Translation files for the module
├── Name.php       : Each plugin requires a main class to interact with the GNU social system
├── README.md      : A good plugin is a documented one :)
├── src            : Some plugins need more than the main class
│   ├── Controller
│   ├── Entity
│   └── Util
│       └── Exception : A sophisticated plugin may require some internal exceptions, these should extend GNU social's own exceptions
├── templates
│   └── Name       : In case the plugin adds visual elements to the UI
└── tests          : Just because it is a plugin, it doesn't mean it shouldn't be equally tested!

You don't need all of these directories or files. But you do have to follow this structure in order to have the core autoload your code.

To make a plugin, the file Name.php has to extend App\Core\Modules\Plugin.

To make a component, the file Name.php has to extend App\Core\Modules\Component.

As with components, some plugins have to follow special APIs in order to successfully provide certain functionality. Under src/Core/Modules you'll find some abstract classes that you can extend to implement it properly.

The main class

The plugin's main class handles events with onEventName methods. It should implement the function version to report the plugin's current version, and handle the event onPluginVersion to add the basic metadata:

/**
 * @return string Current plugin version
*/
public function version(): string
{
    return '0.1.0';
}

/**
 * Event raised when GNU social polls the plugin for information about it.
 * Adds this plugin's version information to $versions array
 *
 * @param &$versions array inherited from parent
 *
 * @return bool true hook value
 */
public function onPluginVersion(array &$versions): bool
{
    $versions[] = [
        'name'        => 'PluginName',
        'version'     => $this->version(),
        'author'      => 'Author 1, Author 2',
        'homepage'    => 'https://gnudev.localhost/',
        'description' => // TRANS: Plugin description.
            _m('Describe this awesome plugin.'),
    ];
    return Event::next;
}

Adding configuration to a Module

The trade-off between re-usability and usability

The more general the interface, the greater the re-usability, but it is then more complex and hence less usable.

It is often good to find a compromise by means of configuration.

Module configuration

The default configuration is placed in local/plugins/Name/config.yaml.

parameters:
  name:
    setting: 42

A user can override this configuration in social.local.yaml with:

parameters:
  locals:
    name:
      setting: 1337

Note that if the plugin's name is something like FirstSecond, it will become first_second in the configuration file.
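
The naming rule and the override behaviour can be sketched with plain arrays. Both functions below are hypothetical and the 'other' key is invented for the example. How GNU social actually merges config.yaml with social.local.yaml is not shown here, so treat this as one plausible reading.

```php
<?php

// Convert a module name such as 'FirstSecond' into its configuration key.
function moduleConfigKey(string $module_name): string
{
    return strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $module_name));
}

// Values in 'locals' take precedence over the module's defaults.
function effectiveConfig(array $defaults, array $locals): array
{
    return array_replace_recursive($defaults, $locals);
}

// 'other' is an invented setting, to show that untouched defaults survive
$defaults = ['name' => ['setting' => 42, 'other' => 'kept']];
$locals   = ['name' => ['setting' => 1337]];
$config   = effectiveConfig($defaults, $locals);
$key      = moduleConfigKey('FirstSecond');
```

Only the settings a user overrides change; everything else keeps the default shipped with the module.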

Initialization

Plugins override this method to do any initialization they need, such as connecting to remote servers or creating paths. It returns a bool hook value: true means continue processing, false means stop.

public function initialize(): bool
{
    return true;
}

Clean Up

Plugins override this method to do any cleanup they need, such as disconnecting from remote servers or deleting temporary files.

public function cleanup(): bool
{
    return true;
}

Debugging and Testing

Testing

GNU social isn't too big or hard to understand, but remember that events are asynchronous and every handler must take care not to interfere with the normal behaviour of the core and other plugins.

We encourage test-driven development, as it helps prevent regressions and unexpected behaviour.

To run GNU social's tests you can execute:

make tests

To write your own TestCase you can use App\Util\GNUsocialTestCase.

Calling $client = static::createClient(); gives you a client to mock HTTP requests.

Finally, to use services such as queues you must call parent::bootKernel().

We adopted PHPUnit as the test framework; the PHPUnit Manual lists the available assertions.

Debugging

Since GNU social is built on Symfony, a useful debugging tool is Symfony's VarDumper component, a friendlier alternative to PHP's var_dump and print_r.

There's also a PsySH REPL that you can access with bin/console psysh to experiment with calling GNU social functions directly.

The Core

This documentation adopted a top-down approach. We believed this to be the most helpful as it reduces the time needed to start developing third party plugins. To contribute to GNU social's core, on the other hand, it's important to understand its flows and internals well.

The core tries to be minimal. Its essence is a set of wrappers around Symfony. It is divided into:

Overview of GNU social's Core Internals

GNU social's execution begins at public/index.php, which gets called by the webserver for all requests. This is handled by the webserver itself, which translates a GET /foo to GET /index.php?p=foo. This feature is called 'fancy URLs', as it was in V2.

The index script handles all the initialization of the Symfony framework and social itself. It reads configuration from .env or any .env.*, as well as social.yaml and social.local.yaml files at the project root. The index script creates a Kernel object, which is defined in src/Kernel.php. This is the part where the code we control starts; the Kernel constructor creates the needed constants, sets the timezone to UTC and the string encoding to UTF8. The other functions in this class get called by the Symfony framework at the appropriate times. We will come back to this file.

Registering services

Next, the src/Util/GNUsocial.php class is instantiated by the Symfony framework, on the 'onKernelRequest' or 'onCommand' events. The former event, as described in the docs:

This event is dispatched very early in Symfony, before the controller is determined. It's useful to add information to the Request or return a Response early to stop the handling of the request.

The latter is dispatched when the bin/console script is used.

In both cases, these events call the register function, which creates static references for services such as logging, events and translation. This is done so that these services can be used via static function calls, which is much less verbose and more accessible than the way the framework recommends. This function also loads all the Components and Plugins which, as in V2, are modules that aren't directly connected to the core code. They implement internal and optional functionality respectively, by handling events launched by various parts of the code.

Database definitions

Going back to the Kernel, the build function gets called by the Symfony framework and allows us to register a 'Compiler Pass'. Specifically, we register App\DependencyInjection\Compiler\SchemaDefPass and App\DependencyInjection\Compiler\ModuleManagerPass. The former adds a new 'metadata driver' to Doctrine. The metadata driver is responsible for loading database definitions. We keep the same method as in V2, where each 'Entity' has a schemaDef static function which returns an array with the database definition. The latter handles the loading of modules (components and plugins).

This database definition is handled by the SchemaDefPass class, which extends Doctrine\Persistence\Mapping\Driver\StaticPHPDriver. The function loadMetadataForClass is called by the Symfony framework for each file in src/Entity/. It allows us to call the schemaDef function and translate the array definition to Doctrine's internal representation. The ModuleManagerPass later uses this class to load the entity definitions from each plugin.

Routing

Next, we'll look at the RouteLoader, defined in src/Core/Router/RouteLoader.php, which loads all the files from src/Routes/*.php and calls the static load method. That method defines routes with an interface similar to V2's connect, except that it requires an additional identifier as the first argument. This identifier is used, for instance, to generate URLs for each route. Each route connects a URL path to a Controller, with the possibility of taking arguments, which are passed to the __invoke method of the respective controller or to the given method. The controllers are defined in src/Controller/, plugins/*/Controller or components/*/Controller, and are responsible for handling a request and returning a Symfony Response object, or an array that gets converted to one (subject to change, in order to abstract HTML vs JSON output).

This array conversion is handled by App\Core\Controller, along with other aspects, such as firing events we use. It also handles responding with the appropriate requested format, such as HTML or JSON, with what the controller returned.

End to end

The next steps are handled by the Symfony framework which creates a Request object from the HTTP request, and then a corresponding Response is created by App\Core\Controller, which matches the appropriate route and thus calls its controller.

Performance

All of this happens on each request, which seems like a lot to handle, and would be too slow. Fortunately, Symfony has a 'compiler' which caches and optimizes the code paths. In production mode, this can be done through a command, while in development mode, it's handled on each request if the file changed, which has a performance impact, but obviously makes development easier. In addition, we cache all the module loading.