My Infrastructure

Setting up the infrastructure

  1. To generate the infrastructure files, run org-babel-tangle on this readme (a command-line sketch follows this list).
  2. Remove the op run parts from the justfile unless you are also using 1Password.
  3. Replace the contents of .env with your tokens or links to 1Password resources.
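
If you prefer to tangle from the command line rather than inside Emacs, something along these lines should work (a sketch; assumes Emacs with org-mode available):

emacs --batch -l org --eval '(org-babel-tangle-file "Readme.org")'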

Documentation

There are two datacenters: premhome and SGP1 (the DigitalOcean region name).

In SGP1 there is a single server, the offsite server, named Infranut-SGP1. This is the first server we bootstrap, and it is the one that runs Headscale.

Setup

Use the ansible provider built from GitHub instead of the one on the registry, since it is more up to date:

gh repo clone ansible/terraform-provider-ansible ~/dev/src/github.com/ansible/terraform-provider-ansible
cd ~/dev/src/github.com/ansible/terraform-provider-ansible
make

Use the overridden provider by adding a dev override to the Terraform CLI configuration (~/.terraformrc on macOS/Linux):

provider_installation {
  dev_overrides {
    "ansible/ansible" = "/Users/yadunut/dev/src/github.com/ansible/terraform-provider-ansible"
  }

  direct {}
}

Offsite Server

DONE Setup Environment variables and helpers

TF_VAR_do_token="op://Infrastructure/Digitalocean token/password"
TF_VAR_cf_token="op://Infrastructure/Cloudflare Token/credential"
TF_VAR_cf_zone="op://Infrastructure/Cloudflare Token/zone"
TF_VAR_ssh_public_key="op://Infrastructure/yadunut/public key"
TF_VAR_headscale_tls_email="op://Infrastructure/Headscale/tls_email"
TF_VAR_proxmox_servers="op://Infrastructure/Miscellaneous/proxmox_servers"
PM_API_URL="op://Infrastructure/terraform-prov/url"
PM_USER="op://Infrastructure/terraform-prov/username"
PM_PASS="op://Infrastructure/terraform-prov/password"
CONSUL_RAW_KEY="op://Infrastructure/Consul Gossip Key/password"
NOMAD_GOSSIP_KEY="op://Infrastructure/Nomad Gossip Key/password"
OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
ANSIBLE_HOST_KEY_CHECKING=false

I'm using a justfile to make running commands easier:

set positional-arguments

default:
  @just --list

terraform *ARGS:
  op run --env-file=".env" --no-masking -- terraform {{ARGS}}

ansible-playbook *ARGS:
  op run --env-file=".ansible-env" --  ansible-playbook {{ARGS}}

Setup Terraform

terraform {
  required_providers {
    digitalocean = {
      source = "digitalocean/digitalocean"
      version = "2.32.0"
    }
    ansible = {
      source = "ansible/ansible"
      version = "1.1.0"
    }
    external = {
      source = "hashicorp/external"
      version = "2.3.2"
    }
    cloudflare = {
      source = "cloudflare/cloudflare"
      version = "4.20.0"
    }
    local = {
      source = "hashicorp/local"
      version = "2.4.1"
    }
    proxmox = {
      source = "Telmate/proxmox"
      version = "2.9.14"
    }
  }
}

provider "cloudflare" {
  api_token = var.cf_token
}

provider "digitalocean" {
  token = var.do_token
}

provider "proxmox" {}

Setup variables needed

variable do_token {
  type=string
  sensitive=true
}
variable cf_token {
  type=string
  sensitive=true
}
variable cf_zone { type=string }
variable ssh_public_key { type=string }
variable headscale_tls_email { type=string }

variable proxmox_servers { type=list(string) }
variable username {
  type=string
  default = "yadunut"
}

Import SSH Key

resource "digitalocean_ssh_key" "yadunut" {
  name = "yadunut"
  public_key = var.ssh_public_key
  lifecycle {
    prevent_destroy = true
  }
}

If the key already exists in DigitalOcean, import it into state:

terraform import digitalocean_ssh_key.yadunut <id>
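
To look up the key ID, the DigitalOcean CLI can list the keys in the account (a sketch; assumes doctl is installed and authenticated):

# list SSH key IDs, names and fingerprints
doctl compute ssh-key list
# then import the matching key, wrapped with op run via just
just terraform import digitalocean_ssh_key.yadunut <id>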

Spin up digital ocean server

Spin up a VM. On the Ansible side I pass in the username that will eventually be used, but for the initial setup I expect to log in as root to configure and set up the VM.

resource "digitalocean_droplet" "infranut_SGP1" {
  image  = "ubuntu-22-04-x64"
  name   = "infranut-SGP1"
  region = "SGP1"
  size   = "s-1vcpu-1gb"
  ssh_keys = [digitalocean_ssh_key.yadunut.id]
}

Assign domains to the server

Set up Cloudflare in Terraform and point an assigned domain at the server:

resource "cloudflare_record" "ts" {
  zone_id = var.cf_zone
  name = "ts"
  type = "A"
  value = digitalocean_droplet.infranut_SGP1.ipv4_address
  proxied = false
}
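
Once applied, the record can be checked straight against DNS (ts.example.com stands in for the real zone here):

# should return the droplet's public IPv4 address
dig +short A ts.example.com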

Setup server with ansible

Add ansible dependencies

---
collections:
  - name: cloud.terraform

roles:
  - name: geerlingguy.docker
  - src: https://github.com/ansible-community/ansible-consul.git
    name: ansible-consul
    scm: git
    version: master

The inventory is generated from Terraform state using the cloud.terraform inventory plugin:

---
plugin: cloud.terraform.terraform_provider
project_path: ../
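
The collection and roles have to be installed before the first playbook run; assuming the requirements above are tangled to ansible/requirements.yml, something like:

cd ansible
# install the cloud.terraform collection and the listed roles
ansible-galaxy collection install -r requirements.yml
ansible-galaxy role install -r requirements.yml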

Run the Ansible playbook from Terraform. I have no idea why this fails with the registry version of the provider, and at this point I've given up figuring out why; updating the ansible provider to the latest build from GitHub resolves the issue.

resource "ansible_playbook" "setup_offsite" {
  playbook = "ansible/setup-offsite.yml"
  name = digitalocean_droplet.infranut_SGP1.ipv4_address
  replayable = false
  verbosity = 5
  extra_vars = {
    created_username = var.username
    ssh_key = "'${var.ssh_public_key}'"
    headscale_hostname = cloudflare_record.ts.hostname
    tls_email = var.headscale_tls_email
  }
}

Roles to run when setting up the server with Ansible.

The first play only succeeds on the initial setup and fails as unreachable on later runs: once the server can no longer be accessed as the root user, the play cannot connect, which is why ignore_unreachable is set.

---
- hosts: all
  remote_user: "root"
  roles:
    - role: roles/do_setup
  ignore_unreachable: true

- hosts: all
  remote_user: "{{ created_username }}"
  become: true
  roles:
    - role: roles/common
    - role: roles/headscale

Useful initial setup for DigitalOcean Ubuntu servers:

---
- name: Setup passwordless sudo
  lineinfile:
    path: /etc/sudoers
    state: present
    regexp: '^%sudo'
    line: '%sudo ALL=(ALL) NOPASSWD: ALL'
    validate: '/usr/sbin/visudo -cf %s'
- name: Create user with sudo privilege
  user:
    name: "{{ created_username }}"
    state: present
    groups: sudo
    shell: /bin/bash
    append: true

- name: Set authorized key for remote user
  become: true
  authorized_key:
    user: "{{ created_username }}"
    manage_dir: true
    state: present
    key: "{{ ssh_key }}"

- name: Disable root login over SSH
  lineinfile:
    path: /etc/ssh/sshd_config
    state: present
    regexp: '^PermitRootLogin'
    line: 'PermitRootLogin no'
    validate: 'sshd -t -f %s'

- name: Restart sshd so the root login change takes effect
  service:
    name: ssh
    state: restarted

- name: Update apt and install packages
  retries: 3
  delay: 3
  apt:
    pkg:
      - curl
      - vim
      - git
    state: latest
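
A quick sanity check after the first run: the created user should be reachable over SSH and have passwordless sudo (203.0.113.10 is a placeholder address):

# should print "ok" without prompting for a password
ssh yadunut@203.0.113.10 'sudo -n true && echo ok'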

Common setup for almost any server: install the required dependencies and repositories, and set up a basic firewall with ufw.

---
- name: Setup hashicorp repositories
  block:
    - apt_key:
        url: https://apt.releases.hashicorp.com/gpg
        state: present
    - apt_repository:
        repo: deb https://apt.releases.hashicorp.com jammy main
        state: present

- name: Setup tailscale repositories
  block:
    - apt_key:
        url: https://pkgs.tailscale.com/stable/ubuntu/jammy.noarmor.gpg
        state: present
    - apt_repository:
        repo: deb https://pkgs.tailscale.com/stable/ubuntu jammy main
        state: present

- name: Update System
  apt:
    update_cache: true
    upgrade: dist

- name: Install ufw and tailscale
  apt:
    pkg:
      - ufw
      - tailscale
    state: latest

- name: Enable and setup ufw
  block:
    - ufw:
        logging: 'on'
    - ufw:
        rule: allow
        port: ssh
        proto: tcp
    - ufw:
        default: deny
        state: enabled
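
After the role has run, the firewall state can be confirmed on the host:

# expect: Status: active, default deny (incoming), 22/tcp (ssh) allowed
sudo ufw status verbose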

DONE Setup headscale on Server

---
- name: Get the url to download to
  become: no
  local_action:
    ansible.builtin.shell curl "https://api.github.com/repos/juanfont/headscale/releases/latest" | jq -r '.assets[] | select(.name | endswith("amd64.deb")) | .browser_download_url'
  register: headscale_deb_url

- name: Install headscale
  apt:
    deb: "{{ headscale_deb_url.stdout }}"

- name: Check if headscale_hostname set
  fail:
    msg: Set headscale_hostname
  when: headscale_hostname is not defined

- name: Check if tls_email set
  fail:
    msg: Set tls_email
  when: tls_email is not defined
- name: Copy the configuration file over
  template:
    src: config.yaml.j2
    dest: /etc/headscale/config.yaml
    mode: u=rw,g=r,o=r

- name: Enable the headscale service
  systemd:
    enabled: true
    state: started
    name: headscale

- name: Enable Port 443 for HTTPS
  ufw:
    rule: allow
    port: '443'
    proto: tcp

- name: Check if API key exists locally
  become: no
  local_action:
    module: stat
    path: "{{ headscale_env_path }}"
  register: headscale_env_stat
- name: Get API Key
  command: "headscale api create -e 1y -o yaml"
  register: headscale_apikey
  when: headscale_env_stat.stat.exists == false

- name: debug apikey
  debug:
    msg: "hs_apikey: {{ headscale_apikey }}"

- name: write api key locally
  become: no
  local_action:
    module: copy
    content: "{{ headscale_apikey.stdout }}"
    dest: "{{ headscale_env_path }}"
  when: headscale_env_stat.stat.exists == false
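
After the role has run, the service and the CLI (over its local socket) can be checked on the server:

# headscale should be enabled and running
systemctl status headscale
# should answer over the local socket; lists users once they exist
sudo headscale users list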

Headscale config file

server_url: https://{{ headscale_hostname }}:443

listen_addr: 0.0.0.0:443
metrics_listen_addr: 127.0.0.1:9090

grpc_listen_addr: 127.0.0.1:50443
grpc_allow_insecure: false

private_key_path: /var/lib/headscale/private.key
noise:
  private_key_path: /var/lib/headscale/noise_private.key
ip_prefixes:
  - fd7a:115c:a1e0::/48
  - 100.64.0.0/10
derp:
  server:
    enabled: false

    region_id: 999

    region_code: "headscale"
    region_name: "Headscale Embedded DERP"

    stun_listen_addr: "0.0.0.0:3478"

  urls:
    - https://controlplane.tailscale.com/derpmap/default

  paths: []

  auto_update_enabled: true

  update_frequency: 24h

disable_check_updates: false

ephemeral_node_inactivity_timeout: 30m

node_update_check_interval: 10s

db_type: sqlite3

db_path: /var/lib/headscale/db.sqlite

# TLS
acme_url: https://acme-v02.api.letsencrypt.org/directory
acme_email: "{{ tls_email }}"

tls_letsencrypt_hostname: "{{ headscale_hostname }}"

tls_letsencrypt_cache_dir: /var/lib/headscale/cache

tls_letsencrypt_challenge_type: HTTP-01
tls_letsencrypt_listen: ":http"

## Use already defined certificates:
tls_cert_path: ""
tls_key_path: ""

log:
  # Output formatting for logs: text or json
  format: text
  level: info

# Path to a file containing ACL policies.
# ACLs can be defined as YAML or HUJSON.
# https://tailscale.com/kb/1018/acls/
acl_policy_path: ""

## DNS
#
# headscale supports Tailscale's DNS configuration and MagicDNS.
# Please have a look to their KB to better understand the concepts:
#
# - https://tailscale.com/kb/1054/dns/
# - https://tailscale.com/kb/1081/magicdns/
# - https://tailscale.com/blog/2021-09-private-dns-with-magicdns/
#
dns_config:
  # Whether to prefer using Headscale provided DNS or use local.
  override_local_dns: true

  # List of DNS servers to expose to clients.
  nameservers:
    - 1.1.1.1

  # NextDNS (see https://tailscale.com/kb/1218/nextdns/).
  # "abc123" is example NextDNS ID, replace with yours.
  #
  # With metadata sharing:
  # nameservers:
  #   - https://dns.nextdns.io/abc123
  #
  # Without metadata sharing:
  # nameservers:
  #   - 2a07:a8c0::ab:c123
  #   - 2a07:a8c1::ab:c123

  # Split DNS (see https://tailscale.com/kb/1054/dns/),
  # list of search domains and the DNS to query for each one.
  #
  # restricted_nameservers:
  #   foo.bar.com:
  #     - 1.1.1.1
  #   darp.headscale.net:
  #     - 1.1.1.1
  #     - 8.8.8.8

  # Search domains to inject.
  domains: []

  # Extra DNS records
  # so far only A-records are supported (on the tailscale side)
  # See https://github.com/juanfont/headscale/blob/main/docs/dns-records.md#Limitations
  # extra_records:
  #   - name: "grafana.myvpn.example.com"
  #     type: "A"
  #     value: "100.64.0.3"
  #
  #   # you can also put it in one line
  #   - { name: "prometheus.myvpn.example.com", type: "A", value: "100.64.0.3" }

  # Whether to use [MagicDNS](https://tailscale.com/kb/1081/magicdns/).
  # Only works if there is at least a nameserver defined.
  magic_dns: true

  # Defines the base domain to create the hostnames for MagicDNS.
  # `base_domain` must be a FQDNs, without the trailing dot.
  # The FQDN of the hosts will be
  # `hostname.user.base_domain` (e.g., _myhost.myuser.example.com_).
  base_domain: {{ headscale_hostname }}

# Unix socket used for the CLI to connect without authentication
# Note: for production you will want to set this to something like:
unix_socket: /var/run/headscale/headscale.sock
unix_socket_permission: "0770"

logtail:
  enabled: false

# Enabling this option makes devices prefer a random port for WireGuard traffic over the
# default static port 41641. This option is intended as a workaround for some buggy
# firewall devices. See https://tailscale.com/kb/1181/firewalls/ for more information.
randomize_client_port: false

DONE Headscale users via Ansible

Wait, I initially did this in Terraform, but it should be done in Ansible instead… so much easier.

The 3 users created are:

  • p for personal (my laptop, phones, etc.)
  • s for servers (nomad, etc.)
  • i for infra (my Proxmox hosts)

The variables:

install_users: ['p', 's', 'i']
headscale_env_path: "{{ playbook_dir }}/../headscale.env"

The tasks:

---
- name: Retrieve the list of existing users
  command: headscale users list -o json-line
  register: users

- name: Install users
  command: "headscale users create {{ item }}"
  loop:
    "{{ install_users | difference(users.stdout|from_json is none|ternary([], users.stdout|from_json|json_query('[].name'))) }}"
    # a bit of json parsing and handling to only install users that have not been installed
- name: check if headscale env exists locally
  become: no
  local_action:
    module: stat
    path: "{{ headscale_env_path }}"
  register: headscale_env_stat

- name: Get authkey for each user
  command: "headscale authkey create --reusable -e 1y -o json -u {{ item }}"
  register: user_authkeys
  loop: "{{ install_users }}"
  when: headscale_env_stat.stat.exists == false

- name: debug file contents
  debug:
    msg: "{{ user_authkeys.results | map(attribute='stdout') | map('from_json')|json_query('[].{key: key, user: user}')|to_yaml(indent=2) }}"
  when: headscale_env_stat.stat.exists == false

- name: Write the retrieved api keys to local
  become: no
  local_action:
    module: copy
    content: "{{ user_authkeys.results | map(attribute='stdout') | map('from_json')|json_query('[].{key: key, user: user}')|to_yaml }}"
    dest: "{{ headscale_env_path }}"
  when: headscale_env_stat.stat.exists == false
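
To confirm what was created, the same CLI can list the users and their keys; the generated keys also end up in headscale.env at the repository root (the authkey list subcommand is assumed to mirror the authkey create used above):

# users created by the loop above
sudo headscale users list
# keys generated for a given user, e.g. the server user "s"
sudo headscale authkey list -u s
# locally, the keys written by the play
cat headscale.env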

DONE figure out how to write the authkeys to a file

DONE Setup Tailscale on Server

- name: Connect to the tailscale network
  command: "tailscale up --force-reauth --auth-key {{ auth_key }} --login-server https://{{ hostname }}:443"

Setup Headscale users

data "local_file" "hs_apikey" {
  filename = "${path.module}/headscale.env"
  depends_on = [ ansible_playbook.setup_offsite ]
}

module "headscale" {
  source = "./modules/headscale"
  apikey = data.local_file.hs_apikey.content
  endpoint = cloudflare_record.ts.hostname
}
variable "apikey" { type=string }
variable "endpoint" { type=string }
terraform {
  required_providers {
    headscale = {
      source = "awlsring/headscale"
      version = "0.1.5"
    }
  }
}

provider "headscale" {
  endpoint = "https://${var.endpoint}"
  api_key = var.apikey
}

resource "headscale_user" "server" {
  name = "s"
}
resource "headscale_user" "personal" {
  name = "p"
}
resource "headscale_user" "infra" {
  name = "i"
}

resource "headscale_pre_auth_key" "server" {
  user = headscale_user.server.name
  reusable = true
  time_to_expire = "1y"

}
resource "headscale_pre_auth_key" "infra" {
  user = headscale_user.infra.name
  reusable = true
  time_to_expire = "1y"
}

output "server_key" {
  value = headscale_pre_auth_key.server
}
output "infra_key" {
  value = headscale_pre_auth_key.infra
}
resource "ansible_playbook" "setup_tailscale" {
  playbook = "ansible/setup-tailscale.yml"
  replayable = false
  extra_vars = {
    hostname = cloudflare_record.ts.hostname
    auth_key = module.headscale.infra_key.key
    created_username = var.username
  }
  name = each.key
  for_each = toset(concat(var.proxmox_servers, tolist([digitalocean_droplet.infranut_SGP1.ipv4_address])))
}

The playbook applies the tailscale role, connecting as the created user and joining as the infra (i) user:

---
- hosts: all
  remote_user: "{{ created_username }}"
  become: true
  roles:
    - role: roles/tailscale
      ts_user: i

Proxmox Nomad Servers

Create VMs on proxmox.

resource "proxmox_vm_qemu" "nomad-server" {
  for_each    = toset(["eagle", "falcon"])
  name        = "nomad-server-${each.key}"
  target_node = each.key
  clone       = "ubuntu-2204-cloud-init"
  agent       = 1
  full_clone  = true
  onboot      = true

  tags = "nomad-server"

  memory = 2048
  cores  = 2
  scsihw = "virtio-scsi-single" # If I don't have this, the defaults override the cloned info

  qemu_os = "l26"

  sshkeys = digitalocean_ssh_key.yadunut.public_key
  ipconfig0 = "ip=dhcp,ip6=dhcp"
  ciuser = var.username

  network {
        bridge    = "vmbr0"
        firewall  = true
        link_down = false
        model     = "virtio"
        mtu       = 0
        queues    = 0
        rate      = 0
        tag       = -1
    }
  lifecycle {
    ignore_changes = [disk, network]
  }
}
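
These VMs can be created or recreated independently of the rest of the configuration with a targeted apply:

# only touch the nomad server VMs
just terraform apply -target='proxmox_vm_qemu.nomad-server'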

Setup VMs on Proxmox with ansible

---
- hosts: all
  remote_user: "{{ created_username }}"
  become: true
  roles:
    - role: roles/common
    - role: roles/tailscale

Run ansible on those VMs

resource "ansible_playbook" "setup_proxmox_servers" {
  playbook = "ansible/setup-proxmox-servers.yml"
  replayable = false
  extra_vars = {
    hostname = cloudflare_record.ts.hostname
    auth_key = module.headscale.server_key.key
    created_username = var.username
  }
  name = each.value.default_ipv4_address
  for_each = proxmox_vm_qemu.nomad-server
}

Setup Proxmox Nomad Clients

resource "proxmox_vm_qemu" "nomad-client" {
  for_each    = { for val in setproduct(["falcon", "eagle"], [1, 2]): "${val[0]}-${val[1]}" => val }
  name        = "nomad-client-${each.key}"
  target_node = each.value[0]
  clone       = "ubuntu-2204-cloud-init"
  agent       = 1
  full_clone  = true
  onboot      = true

  tags = "nomad-client"

  memory = 2048
  cores  = 2
  scsihw = "virtio-scsi-single" # If I don't have this, the defaults override the cloned info

  qemu_os = "l26"

  sshkeys = digitalocean_ssh_key.yadunut.public_key
  ipconfig0 = "ip=dhcp,ip6=dhcp"
  ciuser = var.username

  network {
        bridge    = "vmbr0"
        firewall  = true
        link_down = false
        model     = "virtio"
        mtu       = 0
        queues    = 0
        rate      = 0
        tag       = -1
    }
  lifecycle {
    ignore_changes = [disk, network]
  }
}

The client playbook additionally installs Docker via geerlingguy.docker and adds the created user to the docker group:

---
- hosts: all
  remote_user: "{{ created_username }}"
  become: true
  roles:
    - role: roles/common
    - role: roles/tailscale
    - role: geerlingguy.docker
      docker_users:
        - "{{ created_username }}"
resource "ansible_playbook" "setup_proxmox_clients" {
  playbook = "ansible/setup-proxmox-clients.yml"
  replayable = false
  extra_vars = {
    hostname = cloudflare_record.ts.hostname
    auth_key = module.headscale.server_key.key
    created_username = var.username
  }
  name = each.value.default_ipv4_address
  for_each = proxmox_vm_qemu.nomad-client
}

Ansible Inventory

This is the point where all the automation goes out of the window. First, create an inventory file with the Tailscale addresses generated above, replacing offsite / server? / client? with the Tailscale hostnames of the corresponding servers.

---
consul_instances:
  hosts:
    offsite:
      consul_node_role: bootstrap
      nomad_node_role: both
    server1:
      consul_node_role: server
      nomad_node_role: server
    server2:
      consul_node_role: server
      nomad_node_role: server
    client1:
      consul_node_role: client
      nomad_node_role: client
    client2:
      consul_node_role: client
      nomad_node_role: client
    client3:
      consul_node_role: client
      nomad_node_role: client
    client4:
      consul_node_role: client
      nomad_node_role: client
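
Before running the Consul and Nomad plays, it's worth confirming Ansible can reach every host in the hand-written inventory (ansible/inventory.yml is an assumed filename for the file above):

op run --env-file=".ansible-env" -- ansible -i ansible/inventory.yml consul_instances -m ping -u yadunut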

Consul

---
- hosts: consul_instances
  remote_user: yadunut
  become: true
  roles:
    - role: ansible-consul
      consul_version: latest
      consul_raw_key: "{{ lookup('env', 'CONSUL_RAW_KEY') }}"
      consul_iface: tailscale0
      consul_client_address: "0.0.0.0"
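
Assuming the play above is tangled to ansible/consul.yml, it can be run against that inventory and checked afterwards with the consul CLI on any node:

just ansible-playbook -i ansible/inventory.yml ansible/consul.yml
# on any node: all seven hosts should show up as alive
consul members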

Nomad

---
- hosts: consul_instances
  remote_user: yadunut
  become: true
  roles:
    - role: roles/nomad
      nomad_iface: tailscale0
      nomad_group_name: consul_instances
      nomad_docker_enable: true
      nomad_use_consul: true
      nomad_gossip_key: "{{ lookup('env', 'NOMAD_GOSSIP_KEY') }}"

The role's variables:

---
nomad_config_path: "/etc/nomad.d"
nomad_data_path: "/opt/nomad"

nomad_user: nomad
nomad_group: nomad
nomad_datacenter: dc1

nomad_systemd_unit_path: "/etc/systemd/system"

nomad_node_name: "{{ inventory_hostname_short }}"

nomad_bind_address: "{{ hostvars[inventory_hostname]['ansible_'+ nomad_iface ]['ipv4']['address'] }}"
nomad_advertise_address: "{{ hostvars[inventory_hostname]['ansible_' + nomad_iface]['ipv4']['address'] }}"

nomad_consul_address: "localhost:8500"

nomad_bootstrap_expect: "{{ nomad_servers | count or 3 }}"
nomad_group_name: "nomad_instances"

Derived node-role flags:

---
_nomad_node_client: "{{ (nomad_node_role == 'client') or (nomad_node_role == 'both') }}"
_nomad_node_server: "{{ (nomad_node_role == 'server') or (nomad_node_role == 'both') }}"

The tasks:

---
- name: Expose bind_address, advertise_address and node_role as facts
  set_fact:
    nomad_bind_address: "{{ nomad_bind_address }}"
    nomad_advertise_address: "{{ nomad_advertise_address }}"
    nomad_node_role: "{{ nomad_node_role }}"
    nomad_datacenter: "{{ nomad_datacenter }}"

- name: Add Nomad group
  group:
    name: "{{ nomad_group }}"
    state: present

# Add user
- name: Add Nomad user
  user:
    name: "{{ nomad_user }}"
    comment: "Nomad user"
    group: "{{ nomad_group }}"
    system: true

- name: Add Nomad user to docker group
  user:
    name: "{{ nomad_user }}"
    groups: docker
    append: true
  when:
    - _nomad_node_client | bool

- name: Install Nomad
  apt:
    pkg:
      - nomad
    state: latest

- name: Create directories
  file:
    dest: "{{ item }}"
    state: directory
    owner: "{{ nomad_user }}"
    group: "{{ nomad_group }}"
    mode: "0755"
  with_items:
    - "{{ nomad_data_path }}"

- name: Create config directory
  file:
    dest: "{{ nomad_config_path }}"
    state: directory
    owner: root
    group: root
    mode: 0755

- name: Common Configuration
  template:
    src: nomad.hcl.j2
    dest: "{{ nomad_config_path }}/nomad.hcl"
    owner: root
    group: root
    mode: 0644

- name: Server configuration
  template:
    src: server.hcl.j2
    dest: "{{ nomad_config_path }}/server.hcl"
    owner: root
    group: root
    mode: 0644
  when:
    - _nomad_node_server | bool

- name: Client configuration
  template:
    src: client.hcl.j2
    dest: "{{ nomad_config_path }}/client.hcl"
    owner: root
    group: root
    mode: 0644
  when:
    - _nomad_node_client | bool

- block:
    - name: systemd script
      template:
        src: "nomad_systemd.service.j2"
        dest: "{{ nomad_systemd_unit_path }}/nomad.service"
        owner: root
        group: root
        mode: 0644
      register: nomad_systemd_file
    - block:
      - name: reload systemd daemon
        systemd:
          daemon_reload: true
      - name: Enable nomad at startup (systemd)
        systemd:
          name: nomad
          enabled: yes
      when: nomad_systemd_file.changed
  when: ansible_service_mgr == "systemd"

- name: Start Nomad
  service:
    name: nomad
    enabled: true
    state: restarted

The systemd unit template (nomad_systemd.service.j2):

[Unit]
Description=Nomad
Documentation=https://nomadproject.io/docs/
Wants=network-online.target
After=network-online.target

[Service]
{% if _nomad_node_client %}
User=root
Group=root
{% else %}
User={{ nomad_user }}
Group={{ nomad_group }}
{% endif %}


ExecStart=/usr/bin/nomad agent -config {{ nomad_config_path }}
EnvironmentFile=-/etc/nomad.d/nomad.env
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
KillSignal=SIGINT
LimitNOFILE=65536
LimitNPROC=infinity
Restart=on-failure
RestartSec=2

TasksMax=infinity
OOMScoreAdjust=-1000

[Install]
WantedBy=multi-user.target

The common configuration template (nomad.hcl.j2):

data_dir = "{{ nomad_data_path }}"

name = "{{ nomad_node_name }}"
datacenter = "{{ nomad_datacenter }}"

bind_addr = "{{ nomad_bind_address }}"

enable_syslog = true

advertise {
    http = "{{ nomad_advertise_address }}"
    rpc = "{{ nomad_advertise_address }}"
    serf = "{{ nomad_advertise_address }}"
}

consul {
    # The address to the Consul agent.
    address = "{{ nomad_consul_address }}"
    # The service name to register the server and client with Consul.
    server_service_name = "nomad-server"
    client_service_name = "nomad-client"

    # Enables automatically registering the services.
    auto_advertise = true

    # Enabling the server and client to bootstrap using Consul.
    server_auto_join = true
    client_auto_join = true
}

ui {
  enabled = true
}

The server configuration template (server.hcl.j2):

server {
  enabled = true
  encrypt = "{{ nomad_gossip_key }}"
  bootstrap_expect = 3 # I'm too lazy to figure out how to dynamically derive this number from nomad_node_role == 'server'
}

The client configuration template (client.hcl.j2):

client {
  enabled = true
}

plugin "docker" {
  config {
    volumes {
      enabled = true
    }
  }
}
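
Once the Nomad play has run everywhere, the cluster can be checked from any node with the standard nomad CLI:

# three servers should be listed, with one elected leader
nomad server members
# every client node should show up as ready
nomad node status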

V2

  1. Have a control server with headscale / my own WireGuard management solution

     Features needed:
       • Groups
       • PSK sharing
       • Server for managing data
       • Support for multiple servers
       • Client to connect to WireGuard (no distinction between server and client?)
       • Health checks