pwbm - Personal WayBack Machine
The goal of pwbm is to make an easy-to-use appliance that can be fed URLs, which it scrapes periodically. The content is saved in a manner similar to the popular "Wayback Machine". However, as this is a 'personal' wayback machine, you control which URLs are scanned, and when. The archive is held locally and can be easily managed.
Note: Unlike the "real" Wayback Machine, pwbm does not seek to crawl the entire web, nor does it spider entire websites. It only archives specific URLs given to it. This is by design.
Installation
pwbm is available as a snap in the Snap Store. The snap bundles everything needed to function, including monolith. Installation on Linux is as follows:
snap install pwbm
Note: due to the unfinished nature of pwbm, it's currently only available in the edge channel, so the install command may need the --edge flag (snap install --edge pwbm).
Alternatively, just clone the repo and run the shell script. You'll also need monolith.
Usage
Adding URLs
Simply run pwbm with a URL you'd like it to archive. This does not currently initiate a snapshot of that page.
pwbm https://ubuntu.com/
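As a rough illustration of what adding a URL might involve, here is a minimal sketch that appends a URL to a plain-text list, one per line, and skips duplicates. The function name add_url, the list file, and the duplicate check are all assumptions made for this example; the actual pwbm script may store URLs differently.

```shell
#!/bin/sh
# Hypothetical helper, not the actual pwbm code:
# record a URL in a list file so a later run can snapshot it.
add_url() {
    list=$1
    url=$2
    # Make sure the list file exists before searching it.
    touch "$list"
    # -x matches whole lines only, -F treats the URL as a literal string.
    if ! grep -qxF -- "$url" "$list"; then
        printf '%s\n' "$url" >> "$list"
    fi
}
```

Calling add_url urls.txt https://ubuntu.com/ twice leaves a single entry in urls.txt (a hypothetical file name).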
Gathering page snapshots
Run pwbm with no arguments to start a snapshot of every page.
pwbm
Results are stored in $SNAP_USER_COMMON/archive if installed from a snap, or ./archive if run outside of a snap.
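The choice of archive location described above can be sketched in a few lines of shell. This only mirrors the documented behaviour; the function name archive_dir is an assumption for illustration, and the real script may decide differently.

```shell
#!/bin/sh
# Hypothetical helper, not the actual pwbm code:
# print the archive directory for the current environment.
archive_dir() {
    # snapd sets SNAP_USER_COMMON for commands run from a snap.
    if [ -n "${SNAP_USER_COMMON:-}" ]; then
        printf '%s/archive\n' "$SNAP_USER_COMMON"
    else
        # Outside a snap, fall back to ./archive in the working directory.
        printf './archive\n'
    fi
}
```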
How it works
It's super basic. pwbm just iterates through a list of URLs in a file, spawning monolith and saving the results in a datestamped file in a folder specific to the host and path.
```
$ tree ~/snap/pwbm/common/archive/
/home/alan/snap/pwbm/common/archive/
└── ubuntu.com
    └── 2020-01-18T13:32:39+00:00-index.html

1 directory, 1 file
```
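The loop described above could look roughly like the following sketch. It is not the actual pwbm script: the function names, the one-URL-per-line list format, and the exact path layout are assumptions chosen to match the tree output shown. It relies on monolith's -o option for the output file and on GNU date for the ISO 8601 timestamp.

```shell
#!/bin/sh
# Sketch only, not the actual pwbm code.

# Build the snapshot path: <archive>/<host>[/<path>]/<timestamp>-index.html
snapshot_path() {
    archive=$1
    url=$2
    stamp=$3
    rest=${url#*://}       # strip the scheme, e.g. "https://"
    rest=${rest%%[?#]*}    # drop any query string or fragment
    rest=${rest%/}         # drop one trailing slash
    printf '%s/%s/%s-index.html\n' "$archive" "$rest" "$stamp"
}

# Snapshot every URL listed (one per line) in a file.
snapshot_all() {
    list=$1
    archive=$2
    while IFS= read -r url; do
        # Skip blank lines.
        [ -n "$url" ] || continue
        # date -Iseconds (GNU date) prints e.g. 2020-01-18T13:32:39+00:00
        out=$(snapshot_path "$archive" "$url" "$(date -Iseconds)")
        mkdir -p "$(dirname "$out")"
        # Redirect stdin so monolith cannot consume the URL list.
        monolith "$url" -o "$out" < /dev/null
    done < "$list"
}
```

With a hypothetical list file urls.txt, calling snapshot_all urls.txt ./archive would write one datestamped HTML file per URL.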
Viewing results
Browse the files in the archive/ folder and open them in a browser to view.
A convenience webserver has been added. It can be launched as follows, and presents the archive directory on port 8076.
pwbm.server
Visit http://localhost:8076/ to view the snapshots.
Snaps are applications packaged with all their dependencies to run on all popular Linux distributions from a single build. They update automatically and roll back gracefully.
Snaps are discoverable and installable from the Snap Store, an app store with an audience of millions.
Snap can be installed from the command line on openSUSE Leap 15.x and Tumbleweed.
You first need to add the snappy repository from the terminal. Leap 15.5 users, for example, can do this with the following command:
sudo zypper addrepo --refresh https://download.opensuse.org/repositories/system:/snappy/openSUSE_Leap_15.5 snappy
Swap out openSUSE_Leap_15.5 for openSUSE_Leap_15.4 or openSUSE_Tumbleweed if you're using a different version of openSUSE.
With the repository added, import its GPG key:
sudo zypper --gpg-auto-import-keys refresh
Finally, upgrade the package cache to include the new snappy repository:
sudo zypper dup --from snappy
Snap can now be installed with the following:
sudo zypper install snapd
You then need to either reboot, log out and back in, or source /etc/profile to have /snap/bin added to PATH.
Additionally, enable and start both the snapd and the snapd.apparmor services with the following commands:
sudo systemctl enable --now snapd
sudo systemctl enable --now snapd.apparmor
To install Personal WayBack Machine, simply use the following command:
sudo snap install pwbm