Software installation

Updated 2021-02-25: updated for Influxdb 2
Updated 2021-06-17: updated PiHole backup strategy

This post contains information on how to set up some common services/applications on the Raspberry Pi (running Arch Linux ARM). Some applications/settings might not be relevant for your use-case.

This section contains some background and can be skipped entirely.

There are multiple ways of running/installing applications on the Raspberry Pi. The default is through the package manager bundled with your Linux distribution; for Arch Linux that is pacman. A package manager is a set of tools which can be used to install, remove and upgrade ‘packages’ (applications/libraries/…). It is an easy way to keep all software on your system, including your Linux distribution, up-to-date, and it is what I will be using. Advantages are: easy to use, minimal overhead and limited manual intervention. For popular software there are typically packages available already, which makes installing as simple as running a single command. The downside: if a common package breaks it might affect a lot of other applications/libraries on your system, since by default applications/libraries are not segregated (this also depends on how the application/library itself is deployed).

Containers are another way to deploy your applications. They provide more control and isolation between applications and the host system. With containers one can control which hardware and kernel interfaces are available to the application. A “container” may contain one or multiple applications. Containers start from an image which you create yourself or download from somebody else. Typically there are already images available for most popular applications. Using images allows one to more easily share application installations/configurations between systems. While the improved isolation and control are a plus, they also add some minor performance overhead and I deemed them overkill for my use-case. In addition, you will most likely spend additional time on container configuration if you want to fine-tune the isolation. Currently the most popular container implementation is Docker.

3rd party package management systems or software deployment tools can also be used. Popular ones are Flatpak and Snap. They come with their own feature sets and (dis)advantages; for example, Flatpak also sandboxes applications to provide isolation. I decided to stick to the regular package manager, but depending on your use-case the features of these 3rd party package managers might make them worth considering.

Compilation from source is the most hands-on approach and will not get you automatic updates, but it is an option nevertheless. For a set-and-forget system I would not recommend it; however, if there is no package/image/… available for the application/library you need, it might be your only option.

Which option makes most sense for you depends on your use-case; in general the package manager from your Linux distribution is a safe default bet.

Gitea

Gitea is a lightweight code hosting solution similar to GitHub and GitLab. It comes with a GitHub-like web interface, can be self-hosted and is free.

It supports different database back-ends: PostgreSQL and MySQL are the best supported, but there are other options as well. I will use PostgreSQL, for no particular reason. Install the postgresql package and follow the installation guide: sudo -iu postgres (switch to the postgres user), initdb -D /var/lib/postgres/data (initialize the database cluster) and enable (+ start) the service postgresql.service.
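
Condensed, and assuming a fresh PostgreSQL installation, those steps look like this:

sudo -iu postgres                   # switch to the postgres user
initdb -D /var/lib/postgres/data    # initialize the database cluster
exit                                # back to your own user
sudo systemctl enable --now postgresql.service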

Install gitea and finish the set-up as described on the official Arch wiki (or the official guide): createuser gitea (create the user), createdb -O gitea gitea (create the database), add a new line to /var/lib/postgres/data/pg_hba.conf: local gitea gitea peer, and restart postgresql.service.
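
Condensed, the database set-up looks like this:

sudo -iu postgres
createuser gitea
createdb -O gitea gitea
exit
# add the line below to /var/lib/postgres/data/pg_hba.conf (before any
# catch-all "local all all" entry, pg_hba.conf is order sensitive):
#   local   gitea   gitea   peer
sudo systemctl restart postgresql.service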

When Gitea and PostgreSQL are on the same machine, you should use a Unix socket, as it is faster and more secure. I will not configure Gitea to be accessible from the internet.

Finish configuring gitea by editing /etc/gitea/app.ini and at least set:

[database]
DB_TYPE             = postgres
HOST                = /run/postgresql/
NAME                = gitea
USER                = gitea
PASSWD              =
For the other available options see Gitea’s cheat-sheet. I advise to at least configure: email settings, the root URL (static IP of the Raspberry Pi) and disabling 3rd party services like Gravatar, OpenID sign-in and OpenID self-registration. Most other settings have proper defaults; if you want you can set up SSL.

When you’re done, start and enable gitea.service. If your Raspberry Pi does not have an RTC, make sure to add a time-sync.target drop-in file so the date is correct before Gitea is started. Use: systemctl edit gitea and add:

[Unit]
After=time-sync.target
Wants=time-sync.target
For more information as to why this is required see my previous post.

Go to http://[Raspberry Pi IP]:3000/install to finish the installation. Afterwards you can access gitea on http://[Raspberry Pi IP]:3000.
Every time you make changes to /etc/gitea/app.ini you will have to restart the service for them to take effect.

Backup

Gitea has built-in backup and restore functionality. I advise to create a systemd service and timer to periodically take backups. You can use rsync, scp, ftp, mount, … to upload the backups to a remote location.
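
A minimal sketch of such a service and timer, assuming the backup script below is saved as /usr/local/bin/gitea-backup.sh (a path of my choosing):

# /etc/systemd/system/gitea-backup.service
[Unit]
Description=Gitea backup
After=gitea.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/gitea-backup.sh

# /etc/systemd/system/gitea-backup.timer
[Unit]
Description=Run the Gitea backup weekly

[Timer]
OnCalendar=weekly
Persistent=true

[Install]
WantedBy=timers.target

Enable the timer with: systemctl enable --now gitea-backup.timer.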

Gitea will create a zip file which by default uses /tmp and is hence limited by the amount of free RAM on the Raspberry Pi. Depending on how many repositories you have, and/or how much RAM is in use, you can very quickly run into out-of-memory errors. I strongly advise to use a different temp folder by passing: --tempdir /my/temp/dir.

When using mount the backup script might look something like this (requires elevated privileges to run):

nfsServer=""
nfsServerShare=""
nfsMount="/mnt/AutoBackup"

sudo mkdir -p ${nfsMount}
sudo chmod 777 ${nfsMount}
sudo mount ${nfsServer}:${nfsServerShare} ${nfsMount}

giteaBackupFilename="gitea-dump.zip"
# Gitea uses /tmp for temporary files; on the Pi /tmp lives in RAM and is
# hence limited, so we point Gitea to a temp folder on the NFS share instead
# (at the cost of some performance).
giteaTmpFolder="${nfsMount}/gitea/tmp"

giteaOutput=$(sudo -u gitea gitea dump -c /etc/gitea/app.ini --tempdir ${giteaTmpFolder} --file ${nfsMount}/gitea/${giteaBackupFilename})
giteaStatus=$?

Note: the backup is executed as the gitea user due to the file permissions. See also my previous post about backups.

Restore

Restoring is a manual process as outlined in the documentation. The command to restore the postgres database is: sudo -u gitea psql -d gitea < gitea-db.sql.
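
Roughly, and with file locations that may differ per Gitea version (check the documentation), a restore boils down to:

unzip gitea-dump.zip -d gitea-dump
cd gitea-dump
# restore the database (the command from the prose above)
sudo -u gitea psql -d gitea < gitea-db.sql
# then restore gitea-repo.zip into Gitea's repository root and move the
# dumped configuration/data back into place, as outlined in the documentation
sudo systemctl restart gitea.service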

Pi-hole

Pi-hole is a network-level (DNS) advertisement and Internet tracker blocking application. It acts as your network’s DNS server (+ DHCP server if desired) and comes with a web interface. You have to host it yourself and it is free to use. Besides filtering (blocking) it can also serve as a monitoring tool.

Pi-hole is not bulletproof: DNS over HTTPS and other techniques will render Pi-hole ineffective. By default Pi-hole enables the canary domains, which signal browsers to avoid using DNS over HTTPS.

Install pi-hole-server (AUR); when using yay: yay -S pi-hole-server.

Follow the official Arch wiki:

  1. To use the web interface install: php-sqlite

  2. Update /etc/php/php.ini and enable: extension=pdo_sqlite extension=sockets extension=sqlite3

  3. Install lighttpd and php-cgi.

  4. Copy over Pi-hole’s config cp /usr/share/pihole/configs/lighttpd.example.conf /etc/lighttpd/lighttpd.conf and start (+ enable) the service lighttpd.service.

  5. To enable Pi-hole to fetch block lists, edit /etc/hosts and add

    127.0.0.1              localhost
    ip.address.of.pihole   pi.hole myhostname
    where ip.address.of.pihole is the static IP of your Raspberry Pi and myhostname its host name (use hostnamectl --static to get the hostname).

Just like with Gitea you will need to add a drop-in file for the service. When you’re done, start and enable pihole-FTL.service.

It might be that systemd-resolved.service already occupies port 53, which is required by pihole-FTL.service. To resolve this, stop and disable systemd-resolved.service and restart pihole-FTL.service.
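
That boils down to:

sudo systemctl disable --now systemd-resolved.service
sudo systemctl restart pihole-FTL.service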

If Pi-hole is not able to save settings, look at journalctl; if it says http: Account expired or PAM config lacks an "account" section for sudo, contact your system administrator, reset the expiry date of the http user: sudo chage --expiredate -1 http.

Pi-hole itself can be configured through its web interface, http://[IP Raspberry Pi]/admin/ (or http://pi.hole/admin); the first thing you want to do is enable it.

To verify everything works as expected run pihole -g.

Arch Linux is not an officially supported platform, so Pi-hole’s default debugging mechanism is disabled. If you ever need to troubleshoot, this thread provides some useful methodologies.

Optional:

  • Optimize for solid state drives: edit /etc/pihole/pihole-FTL.conf and enable DBINTERVAL=60.0

  • If you want to use ‘Anonymous mode’ or ‘No Statistics mode’ the queries will still be logged (leaking information); to disable all logging execute pihole logging off.
    Note: with logging disabled, Pi-hole will lose all statistics on reboot in case a privacy level greater than or equal to ‘Anonymous mode’ is selected.

  • For performance improvements edit /etc/resolv.conf to let the Raspberry Pi use Pi-hole directly: nameserver 127.0.0.1.

Running chrony and Pi-hole on the same system, together with the /etc/resolv.conf change above, will make time-sync.target hang indefinitely: chrony cannot resolve the DNS names of the NTP servers because Pi-hole is not up yet, while Pi-hole is waiting on time-sync.target to finish (because of the drop-in file). Hence a deadlock. To fix this, specify the servers by IP instead of DNS name in /etc/chrony.conf. For example:

server 162.159.200.123 iburst
server 138.68.183.179 iburst
server 162.159.200.1 iburst
server 129.250.35.250 iburst

! pool 2.arch.pool.ntp.org iburst

One can look up NTP service IP addresses online.


Finally, on your router point to the IP of the Raspberry Pi running Pi-hole as the DNS server. I strongly advise to add a fallback DNS server; that way, in case your Raspberry Pi goes down, your network (DNS) will stay up.

Within Pi-hole you can configure which upstream DNS servers will be used to resolve the queries; if you have performance problems I advise switching them.

If you use a Synology NAS within your network with DDNS it might require additional attention.

Set-up

By default Pi-hole comes with a blocking list for trackers; it is up to your personal preference to add more lists (maintained by 3rd parties), create your own lists, rules (regex) and exceptions. Besides blacklists one can also configure whitelists.

Pi-hole can be used together with client-side ad blockers like uBlock Origin, where the latter will prevent the DNS request from being sent in the first place. Client-side ad blockers have the advantage of being easier to enable/disable on the fly.

Pi-hole will periodically update the lists it uses, so it requires no manual intervention once set up.

Backup

The web interface offers a backup functionality called “Teleporter” for the configuration (settings & lists), however not for the data. You may back up the FTL database (/etc/pihole/pihole-FTL.db), while Pi-hole is running, using SQLite3 (as per the documentation).
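
For example, using SQLite3’s online backup (the destination path is illustrative):

sudo sqlite3 /etc/pihole/pihole-FTL.db ".backup /mnt/AutoBackup/pihole/pihole-FTL.db"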

The databases are backed up using cp after mounting a remote NFS share:

nfsServer=""
nfsServerShare=""
nfsMount="/mnt/AutoBackup"

sudo mkdir -p ${nfsMount}
sudo chmod 777 ${nfsMount}
sudo mount ${nfsServer}:${nfsServerShare} ${nfsMount}

piholeDir="/etc/pihole"

# All configuration files in /etc are already backed-up by etckeeper
# but the databases are excluded because they bloat the git repository.
#
# Using cp instead of rsync because the home folder of the pihole user is / which complicates
# generating ssh keys.
piholeOutput=$(sudo cp ${piholeDir}/gravity.db ${nfsMount}/pihole)
piholeStatus=$?
piholeOutput+=$(sudo cp ${piholeDir}/pihole-FTL.db ${nfsMount}/pihole)
cpStatus=$?
# Keep the exit status of the first failing cp
if [[ ${piholeStatus} -eq 0 ]]; then
    piholeStatus=${cpStatus}
fi

Make sure the pihole folder already exists on the destination NFS share and has the correct file permissions. Otherwise cp will silently fail (the folder will not be created automatically).

Other configuration files stored in /etc/pihole are backed up using etckeeper as discussed in an earlier post. Make sure to exclude the *.domains and *.sha1 files and the databases in /etc/.gitignore to reduce your etckeeper git repository size.
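
The relevant /etc/.gitignore entries look like this:

pihole/*.domains
pihole/*.sha1
pihole/gravity.db
pihole/pihole-FTL.db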

If you have already committed the databases with etckeeper, remove them with:

cd /etc
sudo git rm --cached pihole/gravity.db
sudo git rm --cached pihole/pihole-FTL.db
sudo etckeeper commit -m "Remove pihole databases from git"

InfluxDB

InfluxDB (2) is a database for time series data; install influxdb. It can be configured through a config.{json|toml|yaml|yml} file in the current working directory (or through environment variables, or influxd flags). One can also specify the location with export INFLUXD_CONFIG_PATH=/path/to/custom/config/directory. I will use /etc/influxdb/config.toml; make sure the influxdb user has sufficient rights to read it.

At least enable the bind address to allow for backups, for further configuration options I refer to the, excellent, official documentation.
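
A minimal /etc/influxdb/config.toml sketch (the paths below are assumptions based on common package defaults; verify them against the documentation):

# where InfluxDB keeps its metadata and time series data
bolt-path = "/var/lib/influxdb/influxd.bolt"
engine-path = "/var/lib/influxdb/engine"
# the HTTP bind address, used by clients and for backups
http-bind-address = ":8086"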

Just like with Gitea you will need to add a drop in file for the service. When you’re done start and enable influxdb.service.

Set-up

It comes with a web UI ([IP Raspberry Pi]:8086) or CLI, see the documentation. Follow the set-up process and create a telegraf bucket.

Since there might be quite a lot of requests, InfluxDB may by default flood the journalctl log; I advise to adjust the log level to “warn”.
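
In /etc/influxdb/config.toml that is a single line:

log-level = "warn"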

Backup

Backing up InfluxDB is very similar to Gitea; the backup command is: influxd backup ${nfsMount}/influxdb. There are some more options as specified in the documentation. Note: InfluxDB uses the current date and timestamp in the backup filenames; depending on your set-up you might want to remove the old files to avoid using too much storage space over time.
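
Reusing the NFS mount variables from the Gitea script, a sketch (the 30-day retention is an arbitrary choice of mine):

influxOutput=$(sudo influxd backup ${nfsMount}/influxdb)
influxStatus=$?

# prune backup files older than 30 days to bound the storage usage
sudo find ${nfsMount}/influxdb -type f -mtime +30 -delete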

Telegraf

Telegraf is a plugin-driven server agent for collecting and sending metrics and events from databases, systems, and IoT sensors to a database, often InfluxDB (other databases are supported as well).

Install telegraf or telegraf-bin (both AUR); it will automatically create an InfluxDB database (called telegraf by default) when it starts up for the first time. To specify what should be measured (some settings are on by default), configure /etc/telegraf/telegraf.conf. For all options see the official documentation.

Just like with Gitea you will need to add a drop in file for the service. When you’re done start (+ enable) telegraf.service.

Set-up

Through InfluxDB generate a token for the appropriate bucket (in my case telegraf) and give it read/write access. Disable (comment out) the [[outputs.influxdb]] section and uncomment the [[outputs.influxdb_v2]] section. Fill in urls, organization, bucket and token.
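
The relevant part of /etc/telegraf/telegraf.conf then looks like this (token and organization are placeholders):

[[outputs.influxdb_v2]]
  urls = ["http://127.0.0.1:8086"]
  token = "my-token"           # the read/write token generated above
  organization = "my-org"      # your InfluxDB organization name
  bucket = "telegraf"          # the bucket created earlier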

To limit the log file size I advise to only log error messages by enabling quiet = true in the config file.
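
That is a single setting in the [agent] section:

[agent]
  quiet = true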

Backup

You only need to backup /etc/telegraf/telegraf.conf which can be done through etckeeper as discussed in an earlier post.

Grafana

Grafana is a general purpose dashboard and graph composer running as a web application. It is free to use and self hosted. Install grafana-bin (AUR), for more info see the official Arch Linux wiki.

However, if influxdb (data storage) and grafana (data visualization) run on the same machine, one could opt to skip grafana and use influxdb’s native data visualization tool (web UI). In fact it is so similar to grafana that queries can be copied verbatim; feature-wise they are on par. This would eliminate running an additional service (grafana) on your system.

In the config file /etc/grafana/grafana.ini change http_port to 4000 (the default of 3000 conflicts with Gitea). Afterwards the web interface can be used to change more settings.
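
The relevant grafana.ini section:

[server]
http_port = 4000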

Just like with Gitea you will need to add a drop in file for the service. When you’re done start (+ enable) grafana.service.

Go to http://[Raspberry Pi IP]:4000/ and change the default admin password upon first logon.

Set-up

To set up Grafana to use InfluxDB follow the official documentation: log in and go to “Data Sources”. Select “Add new Data Source” and find InfluxDB under “Timeseries Databases”. Set the URL: http://127.0.0.1:8086. Select “Flux” as the query language and fill in the “InfluxDB Details” (first generate a new access token in InfluxDB if required).

Next up you will need to create dashboards to show the data, there are many community dashboards available or you can create your own.

Make sure to once in a while run the plug-in upgrade command and restart grafana: grafana-cli plugins update-all; this is not automatically done on grafana upgrades.

Backup

The database and plug-ins are stored in /var/lib/grafana/ which can be backed up through rsync as discussed in an earlier post. For example:

rsyncServer=""
rsyncDestination=""
rsyncUser=""
rsyncOptions="-aqrm --delete -e ssh"

grafanaDir="/var/lib/grafana/"

grafanaOutput=$(sudo -u grafana rsync ${rsyncOptions} --exclude '.ssh' ${grafanaDir} ${rsyncUser}@${rsyncServer}:${rsyncDestination}/grafana)
grafanaStatus=$?

The backup must be executed as the grafana user due to local file permissions. I advise to generate a new set of ssh keys (no password) for grafana and add them to the remote host as well: cat ~/.ssh/id_rsa.pub and manually copy the key to authorized_keys on the destination instead of using ssh-copy-id. Because grafana’s user home is set to /var/lib/grafana/, the ssh keys will be stored there by default and hence included in the backup; to exclude them from rsync use --exclude '.ssh'.
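
A sketch of the key generation (the key type and the authorized_keys location on the destination depend on your remote host):

sudo -u grafana mkdir -p /var/lib/grafana/.ssh
sudo -u grafana ssh-keygen -N "" -f /var/lib/grafana/.ssh/id_rsa
sudo -u grafana cat /var/lib/grafana/.ssh/id_rsa.pub
# append the printed public key to ~/.ssh/authorized_keys on the backup host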

Noticed an error in this post? Corrections are appreciated.

© Nelis Oostens