Installation

PeARS Federated is a version of PeARS for federated use. Admins create PeARS instances that users can join to contribute to the index.

PeARS Federated is provided as-is. Before you use it, please check the rules of your country on crawling Web content and displaying snippets. And be a good netizen: do not overload people’s servers while indexing!

PeARS can be installed locally from source for development, testing, or even to manage one’s own personal search service. For public use (i.e. to share your search service with the world), we recommend the Docker install. For those who would like to share their search index but are not fully comfortable with running their own server, we also have bespoke installation instructions for the PythonAnywhere cloud service, which provides an easy-to-use, visual interface to set up one’s own Web app.

Install from source

1. Clone the repo on your machine

git clone https://github.com/PeARSearch/PeARS-federated.git

2. (Optional step) Set up a virtualenv in your directory

If you haven’t yet set up virtualenv on your machine, please install it along with pip (the commands below are for Debian and Ubuntu):

sudo apt-get update
sudo apt-get install python3-setuptools
sudo apt-get install python3-pip
sudo apt install python3-virtualenv

Then change into the PeARS-federated directory:

cd PeARS-federated

Then run:

virtualenv env && source env/bin/activate

3. Install the build dependencies

From the PeARS-federated directory, run:

pip install -r requirements.txt

4. (Optional step) Install further languages

If you want to search and index in several languages at the same time, you can add multilingual support to your English install. To do this:

flask pears install-language lc

where you should replace lc with a language code of your choice. For now, we are only supporting English (en), German (de), French (fr) and Malayalam (ml) but more languages are coming!
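
If you need several of the supported languages, the command above can be wrapped in a small loop. The following is a minimal sketch, not part of PeARS itself: the `install_languages` function and the `FLASK_BIN` variable are hypothetical names introduced here for illustration, and the sketch assumes it is run from the root of the repository with your virtualenv active.

```shell
# Hypothetical helper: run `flask pears install-language` once per language code.
# FLASK_BIN is an assumption of this sketch; it defaults to `flask` and lets you
# preview the commands by setting it to `echo`.
install_languages() {
    for lc in "$@"; do
        "${FLASK_BIN:-flask}" pears install-language "$lc" || return 1
    done
}

# Example: install German and French support.
# install_languages de fr
```

The function stops at the first language that fails to install, so a typo in a language code does not go unnoticed.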

5. Set up your .env

There is a .env template file at .env-template in the root directory of the repository. You should copy it to .env and fill in the information for your setup. If you need help setting the environment variables, check out our extra instructions here.
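
The copy step can be sketched as follows. This is an illustrative helper rather than a script shipped with PeARS: the function name `init_env_file` is hypothetical, and it refuses to overwrite an existing .env so that a filled-in configuration is not lost.

```shell
# Hypothetical helper: copy .env-template to .env in the given directory
# (default: current directory) unless a .env is already present.
init_env_file() {
    dir="${1:-.}"
    if [ ! -f "$dir/.env-template" ]; then
        echo "No .env-template found in $dir" >&2
        return 1
    fi
    if [ -f "$dir/.env" ]; then
        echo "$dir/.env already exists, leaving it untouched" >&2
        return 0
    fi
    cp "$dir/.env-template" "$dir/.env"
}

# Example, from the root of the repository:
# init_env_file && "${EDITOR:-vi}" .env
```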

6. Run your pear!

While on your local machine, in the root of the repo, run:

python3 run.py

Now, open localhost:8080 in your browser. You should see the search page for PeARS. You don’t have any pages indexed yet, so start indexing to get going!

Install from docker

These instructions assume that you are running your own server and that you have a domain name available for your PeARS instance.

1. Deploy Docker and Docker Compose

The following instructions are for Ubuntu. For other distributions, refer to the official Docker documentation.

  • SSH into your server.

  • Install necessary packages and Docker:

sudo apt-get update
sudo apt-get install -y ca-certificates curl gettext vim
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

2. Deploy PeARS-federated

  • SSH into your server

  • Set the domain name and instance specific directory name

export DOMAIN=pears-instance-url.com # Provide the URL on which you want to reach your pears-federated instance
export PEARS_DIR=~/pears-instance-name-1 # replace `pears-instance-name-1` with the name of your instance for ease of identification
export STAGE=production # replace this with `staging` if you are just testing the setup, otherwise it will create a TLS certificate for you
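
The later envsubst step substitutes these variables into the compose file, and an unset variable is replaced by an empty string without any warning. A quick check like the one below can catch that early. It is only a sketch: the function name `check_deploy_vars` is hypothetical and not part of the PeARS deployment files.

```shell
# Hypothetical helper: report any of DOMAIN, PEARS_DIR, STAGE that is unset or empty.
# Returns 0 when all three are set, 1 otherwise.
check_deploy_vars() {
    missing=0
    for name in DOMAIN PEARS_DIR STAGE; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example:
# check_deploy_vars && echo "Ready to run envsubst"
```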

Download the docker-compose file and set up the base directory for your instance

Download docker-compose.yaml from the GitHub repository to the base of your server (the command below saves it as template.yaml):

wget https://raw.githubusercontent.com/PeARSearch/PeARS-federated/nvn/add-deploy-files/deployment/docker-compose.yaml -O template.yaml

Substitute the variables set above into the docker-compose file:

envsubst < template.yaml > docker-compose.yaml
rm -rf template.yaml

Create a directory to store your instance details and to store persistent data for the instance:

mkdir -p ${PEARS_DIR}/data

Configure the environment details for your instance

Download the .env-template file from the GitHub repository:

wget https://raw.githubusercontent.com/PeARSearch/PeARS-federated/nvn/add-deploy-files/deployment/.env-template -O ${PEARS_DIR}/.env

Update the values in the .env file to match your configuration (follow the instructions in the .env file to fill in the data):

vim ${PEARS_DIR}/.env

Bring Up the Docker Compose

Note

The command below must be run from the directory that contains the docker-compose.yaml file.

docker compose up -d

Point your DNS to the IP address of the server

Make sure you create an A record pointing from your PeARS URL to the public IP address of the server.

3. (Optional) Adding more instances to the same server

If you want to host several PeARS instances on the same server, you can re-use the same docker-compose file: add a new service definition for each instance, and configure the https-portal container already present in the file to route each domain name to its instance. Here are the step-by-step details:

Note

We assume you have already followed the above steps and have a single instance running already at this point.

Create a new directory for the new instance and download the environment variable file

export PEARS_DIR_2=~/pears-instance-name-2 # replace pears-instance-name-2 with your new instance name
mkdir -p ${PEARS_DIR_2}/data
# You can also copy this file from your existing instance directory for ease of editing
wget https://raw.githubusercontent.com/PeARSearch/PeARS-federated/nvn/add-deploy-files/deployment/.env-template -O ${PEARS_DIR_2}/.env

Change the environment details in the .env file:

vim ${PEARS_DIR_2}/.env

Update the docker-compose to also bring up the second instance. If you open your docker-compose.yaml file in the server at this point, you will find something like this:

version: '3.8'

services:
    pears-federated:
        env_file:
        - pears-instance-name-1/.env
        image: pearsproject/pears-federated:latest
        volumes:
        - pears-instance-name-1/data/:/var/lib/pears/data

    https-portal:
        image: steveltn/https-portal:1
        environment:
            DOMAINS: 'pears-instance-url.com -> http://pears-federated:8000'
            STAGE: production
        ports:
        - "80:80"
        - "443:443"
        depends_on:
        - pears-federated
        volumes:
        - https-portal-data:/var/lib/https-portal

To add another instance, you will have to first copy the pears-federated container definition to a new definition in the file with appropriate names as follows:

version: '3.8'

services:
    pears-federated: # if you want you can also rename this to have a more identifiable name
        env_file:
        - pears-instance-name-1/.env
        image: pearsproject/pears-federated:latest
        volumes:
        - pears-instance-name-1/data/:/var/lib/pears/data

    pears-federated-instance-2: # !! CHANGE rename this to have a more identifiable suffix
        env_file:
        - pears-instance-name-2/.env # !! CHANGE point to your new directory pears-instance-name-2
        image: pearsproject/pears-federated:latest
        volumes:
        - pears-instance-name-2/data/:/var/lib/pears/data # !! CHANGE point to your new directory pears-instance-name-2
    ...

Update the https-portal service to point to the new instance as well:

version: '3.8'

services:
    pears-federated:
        env_file:
        - pears-instance-name-1/.env
        image: pearsproject/pears-federated:latest
        volumes:
        - pears-instance-name-1/data/:/var/lib/pears/data

    pears-federated-instance-2:
        env_file:
        - pears-instance-name-2/.env
        image: pearsproject/pears-federated:latest
        volumes:
        - pears-instance-name-2/data/:/var/lib/pears/data

    https-portal:
        image: steveltn/https-portal:1
        environment:
            # !! CHANGE: map the domain of your new instance to http://<name-of-the-new-instance-in-this-file>:8000
            # Separate the entries with commas; any number of mappings is supported
            DOMAINS: 'pears-instance-url.com -> http://pears-federated:8000, pears-instance-2-url.com -> http://pears-federated-instance-2:8000'
            STAGE: production
        ports:
        - "80:80"
        - "443:443"
        depends_on:
        - pears-federated
        - pears-federated-instance-2 # !! CHANGE: notice that it is now depending on the new instance as well
        volumes:
        - https-portal-data:/var/lib/https-portal

Bring Up the Docker Compose

Note

The command below must be run from the directory that contains the docker-compose.yaml file.

Start the Docker Compose services:

docker compose up -d

Check that the new instance is running:

docker ps

Point your DNS to the IP address of the server

Make sure you create an A record pointing from your new PeARS URL to the public IP address of the server.

If you want to add a third instance, you can follow the same steps as above but for a third entry.
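
For a third or fourth instance, the DOMAINS value grows by one comma-separated entry per instance. The sketch below assembles that string from domain and service-name pairs so that it can be pasted into docker-compose.yaml. The helper name `build_domains` and the `domain=service` argument format are assumptions of this example, the third domain and service name are placeholders, and port 8000 is taken from the compose file shown above.

```shell
# Hypothetical helper: turn "domain=service" pairs into an https-portal DOMAINS value.
build_domains() {
    result=""
    for pair in "$@"; do
        domain="${pair%%=*}"    # text before the first "="
        service="${pair#*=}"    # text after the first "="
        entry="$domain -> http://$service:8000"
        if [ -z "$result" ]; then
            result="$entry"
        else
            result="$result, $entry"
        fi
    done
    printf '%s\n' "$result"
}

# Example with three instances:
# build_domains \
#     pears-instance-url.com=pears-federated \
#     pears-instance-2-url.com=pears-federated-instance-2 \
#     pears-instance-3-url.com=pears-federated-instance-3
```

Each service name passed to the helper has to match a service defined in docker-compose.yaml, and each new service should also be listed under depends_on for https-portal.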

4. Management

To avoid loss of data, regularly back up the data folder:

Create a backup directory:

mkdir -p ~/pears-federated-backups

Copy the data directory to the backup directory:

cp -r ~/pears-instance-name-1/data ~/pears-federated-backups/data_backup_$(date +%Y%m%d%H%M%S)

Schedule this backup process to run regularly, using a cron job or another automation tool, to ensure your data is safe. For additional safety, you can also upload these backup directories to remote cloud storage.
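
As one way to automate this, the two commands above can be wrapped in a function and called from cron. This is a sketch under assumptions rather than a script provided by PeARS: the function name `backup_pears_data`, the script path in the sample crontab line, and the nightly schedule are all illustrative.

```shell
# Hypothetical helper: copy a data directory into a timestamped folder
# inside the backup directory and print the path of the new backup.
backup_pears_data() {
    src="$1"
    dest_root="$2"
    if [ ! -d "$src" ]; then
        echo "Data directory $src not found" >&2
        return 1
    fi
    mkdir -p "$dest_root" || return 1
    target="$dest_root/data_backup_$(date +%Y%m%d%H%M%S)"
    cp -r "$src" "$target" && printf '%s\n' "$target"
}

# Example:
# backup_pears_data ~/pears-instance-name-1/data ~/pears-federated-backups
#
# Sample crontab entry (edit with `crontab -e`), assuming the function and the
# call above are saved in a script at the hypothetical path ~/backup-pears.sh;
# it runs every night at 03:00:
# 0 3 * * * /bin/sh "$HOME/backup-pears.sh"
```

The sketch never deletes old backups, so disk usage grows with every run; pruning is left to you.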