This is an opinionated guide for how to set up IT infrastructure for a new lab. It assumes that you have at least some computing background, though it should be possible to follow along without one if you do a bit of research whenever you encounter something you don’t understand.
I strongly recommend going with AWS over other providers like GoDaddy or Namecheap, since Amazon is a multi-billion dollar business that won’t be going anywhere for decades. Moving domains between providers is possible, but annoying. AWS also has a reasonable API in case you want to do more advanced things in the future, like programmatically updating DNS entries.
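As a rough illustration of what "programmatically updating entries" can look like, here is a minimal sketch using boto3, assuming your DNS records live in a Route 53 hosted zone and that AWS credentials are already configured. The function names, record name, IP address, and zone ID are placeholders of mine, not anything from the lab setup.

```python
from typing import Any


def build_upsert(name: str, record_type: str, value: str, ttl: int = 300) -> dict[str, Any]:
    """Build a Route 53 change batch that creates or overwrites one record."""
    return {
        "Comment": f"Upsert {record_type} record for {name}",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": name,
                    "Type": record_type,
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": value}],
                },
            }
        ],
    }


def apply_change(hosted_zone_id: str, change_batch: dict[str, Any]) -> str:
    """Send the change batch to Route 53 and return the ID of the change."""
    import boto3  # imported here so build_upsert works without boto3 installed

    client = boto3.client("route53")
    response = client.change_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        ChangeBatch=change_batch,
    )
    return response["ChangeInfo"]["Id"]
```

You would call something like `apply_change("YOUR_ZONE_ID", build_upsert("wiki.example.org.", "A", "203.0.113.10"))`, substituting your own hosted zone ID, record name, and address.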
I recently set up a new website for the Manglik lab here, which is generated by this GitHub repository.
There is a single source of truth for lab members, _data/authors.yml, which is used both to generate the members page and to set blog post authorship.
To avoid any repetition, the publication list is generated from blog posts where the front matter contains publication: true, making it possible to have both standard blog posts and ones that announce publications while simultaneously creating an entry on the publications list.
See here for an example.
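The site itself does this filtering at build time, but if you want to check locally which posts will show up on the publications list, a small script can do the same test. This is only a sketch: it assumes posts are Markdown files with simple key: value front matter between two --- lines, and the function names are my own.

```python
from pathlib import Path


def read_front_matter(text: str) -> dict[str, str]:
    """Parse top-level `key: value` lines between the leading '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter block
        if line.startswith((" ", "\t", "#")) or ":" not in line:
            continue  # skip nested values, comments, and bare list items
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def publication_posts(posts_dir: str) -> list[str]:
    """Return the names of posts whose front matter sets publication: true."""
    flagged = []
    for path in sorted(Path(posts_dir).glob("*.md")):
        fields = read_front_matter(path.read_text(encoding="utf-8"))
        if fields.get("publication", "").lower() == "true":
            flagged.append(path.name)
    return flagged
```

Running `publication_posts("_posts")` from the repository root lists the posts flagged as publications. This is not a full YAML parser, so anything beyond flat key: value pairs is ignored.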
I recommend starting with the Manglik lab website as a template and editing the contents as appropriate, since it’s much cleaner than the repository that generates the Fraser lab website. You’ll only need to edit the following:
- _data/authors.yml to include your lab members
- _pages/about.md to include your contact info
- _pages/members.md to edit the “Joining” section
- _pages/publications.md to edit the Pubmed link to your own name
- research_/ to set your research interests
- assets/images/ (but not assets/css/)
- CNAME to match the URL of your website
- README.md
- _config.yml to set the site name, PI info, and site description
- _posts/, which generates both the blog posts and publications list as described above

Theoretically, GitHub limits these pages to less than 1 GB (which you hit surprisingly quickly once you start adding article PDFs or high-res images), but I don’t think they enforce it. Ideally, you’ll want to host anything over a couple MB separately, but that’s kind of a pain. Generally, GitHub and large files don’t play well together, since Git maintains an append-only history which begins to add up when you’re adding and removing files.
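To spot files that are candidates for hosting elsewhere, a short script along these lines can list everything in the working tree above a size threshold. It is a sketch only: the 2 MB default is my reading of "a couple MB", and it looks at the current files, not at what is already buried in the Git history.

```python
import os


def find_large_files(root: str, limit_bytes: int = 2 * 1024 * 1024) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for files over limit_bytes, largest first."""
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Git's own object store; it is not site content.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks, which may be broken
            size = os.path.getsize(path)
            if size > limit_bytes:
                results.append((os.path.relpath(path, root), size))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `find_large_files(".")` from the repository root before committing gives you a quick list of PDFs and images worth moving out of the repository.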
Consider registering for an email service on your domain so that you’re not tied to your university’s email infrastructure. It will be hard to find one that costs less than $5 per user per month, which will add up quickly. Most email services don’t support archiving accounts, so you’ll be paying that amount forever unless you’re okay with deleting everything.
I personally like Fastmail, which has a nice, snappy user interface, excellent support, and a free 30-day trial. Its family plan supports up to 6 users for a flat $11 per month if paid yearly (discounted if you subscribe for longer). They also pro-rate unused subscriptions, so when you exceed 6 users you can transition to a business plan that scales to an unlimited number of users at $5 per user per month without wasting money.
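To see how the bill changes as the lab grows, here is a tiny sketch that uses only the two prices quoted above (a flat $11 for up to 6 users, then $5 per user). The function name and defaults are mine, and actual pricing may differ from these numbers.

```python
def monthly_cost(
    users: int,
    family_flat: float = 11.0,
    family_max: int = 6,
    per_user: float = 5.0,
) -> float:
    """Monthly cost: the flat family plan up to family_max users, per-user pricing beyond that."""
    if users <= 0:
        return 0.0
    if users <= family_max:
        return family_flat
    return per_user * users
```

Under these numbers, adding a seventh user moves the bill from $11 to $35 per month, so it is worth knowing when you are about to cross that line.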
Topicbox is an email-based service that has inboxes designed to be shared. It’s $15 per month for up to 50 users and any number of virtual addresses, and there’s a three-month free demo. I recommend this since you can create one virtual inbox per vendor or group of people.
For example:
There’s a web interface that shows all the emails received for each virtual address. Additionally, you can set it up so that users can subscribe to any subset of the various virtual inboxes and automatically receive a copy of any email received by those addresses.
You’ll need to follow the directions here to use your own domain instead of a topicbox.com domain. You can’t easily host both the user emails mentioned above and the shared virtual addresses on the same domain due to limitations on how email routing works, so I suggest hosting it on a subdomain (e.g., box.example.org) while your primary email is hosted on the main domain.
Lab wikis are great for storing general lab info, like an onboarding guide. I really like Wiki.js, since you can set it up to sync with Git; this allows you to update the wiki the same way you update your website, in addition to using the built-in editor. The wiki files are all plain text, which means the wiki is reasonably browsable through GitHub in case Wiki.js ever stops being developed, and easy to port to a different wiki engine (e.g., DokuWiki or MediaWiki, which powers Wikipedia) if you ever want to. You’ll have to host it yourself, which can be done either on DigitalOcean using an image pre-configured by the Wiki.js developers following this official guide or on AWS by following this community guide.
Right now, I’m working on setting up a pipeline that will enable us to take a recording from a meeting, convert it to text using a speech-to-text model, label each sentence by speaker, summarize it in a couple of paragraphs using the open-source Mixtral 8x7B, then upload it to the fully searchable wiki by creating a Git commit that’s pushed to GitHub and synced to the wiki, all without human intervention. If you’re a member of the Fraser lab, you can check this out for an example.
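The last step of that pipeline, getting a finished summary into the wiki, is the easiest part to sketch. The following assumes you already have the summary text and a local clone of the repository that Wiki.js syncs with. The meetings/ folder, the page layout, and the function name are hypothetical choices of mine, and the transcription, speaker labeling, and summarization steps are not shown.

```python
import subprocess
from pathlib import Path
from typing import Callable


def publish_summary(
    repo_dir: str,
    meeting_date: str,
    summary: str,
    run: Callable[..., object] = subprocess.run,
) -> Path:
    """Write a meeting summary into the wiki's Git repository, then commit and push it."""
    # Hypothetical layout: one Markdown page per meeting under meetings/.
    page = Path(repo_dir) / "meetings" / f"{meeting_date}.md"
    page.parent.mkdir(parents=True, exist_ok=True)
    page.write_text(f"# Meeting {meeting_date}\n\n{summary}\n", encoding="utf-8")

    relative = page.relative_to(repo_dir).as_posix()
    for command in (
        ["git", "add", relative],
        ["git", "commit", "-m", f"Add meeting summary for {meeting_date}"],
        ["git", "push"],
    ):
        # check=True raises if any Git step fails, so errors are not silently ignored.
        run(command, cwd=repo_dir, check=True)
    return page
```

The `run` parameter exists so the Git calls can be swapped for a stub when testing; in normal use you would leave it at its default and let the wiki pick up the new page on its next sync.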
Uhh… Welcome to the land of only bad options, approximately ordered from least bad to terrible: