DNS workgroup update

(DNS WG at #ripe64)

RIPE report

  • RIPE runs K-root: 18 instances, IPv6 enabled, migrating to CentOS 6. IPv6 traffic is growing.
  • They have 117 signed zones.
  • DNSSEC growth in reverse zones is steady, but they are still at about 1%.
  • They have 77 ccTLDs, and they have IPv6 glue for every root zone.
  • DNSMON will be migrating to the Atlas infrastructure.
  • RIPE NCC will probably not run DNS secondary services for the new generic TLDs, unless the RIPE NCC community changes its mind quite radically (they also pushed out the bigger ccTLDs).

Signing .si Before and After

  • Low interest for DNSSEC in Slovenia
  • No problems with CPEs reported
  • Around 40% of queries have DO bit set, reply lengths still remain below 512 bytes.
  • They use OpenDNSSEC
  • No problems were detected, just some (expected) rise in traffic.
  • Currently only 20 delegations signed, no interest from banks or government institutions.
  • Need to raise DNSSEC awareness among DNS community in Slovenia.

dnSSexy – a verifying DNS (sec) proxy

  • Prevents bogus zones from getting out in the wild.
  • There are a number of verifiers available, but they check a zone file before it is served, and that is not always possible.
  • dnSSexy is just another nameserver that sits between a hidden master and public slaves.
  • Once it is notified, it transfers the zone but does not serve it yet. First it verifies the zone, and only if verification passes does it notify the public slaves.
  • It is implemented using NSD.
  • dnSSexy is a fork of NSD and will be available as a separate package.
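
The transfer, verify, then notify flow described above can be sketched roughly as follows. This is only an illustration of the idea, not dnSSexy's actual code: the `Zone` and `VerifyingProxy` classes, the flat record tuples, and the toy check (every RRset must have a covering RRSIG entry) are all assumptions made for the example. A real verifier would validate signatures cryptographically and would treat delegations and glue differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Zone:
    name: str
    serial: int
    # Hypothetical flat representation: (owner, rtype, rdata).
    # For RRSIG entries, rdata holds just the covered type.
    records: List[Tuple[str, str, str]] = field(default_factory=list)


def has_signature_for_every_rrset(zone: Zone) -> bool:
    """Toy check: each (owner, type) pair must be covered by an RRSIG entry."""
    covered = {(o, rdata) for (o, t, rdata) in zone.records if t == "RRSIG"}
    rrsets = {(o, t) for (o, t, _) in zone.records if t != "RRSIG"}
    return rrsets <= covered


class VerifyingProxy:
    """Sits between a hidden master and the public slaves."""

    def __init__(self, verify: Callable[[Zone], bool],
                 notify_slaves: Callable[[str, int], None]):
        self.verify = verify
        self.notify_slaves = notify_slaves
        self.served: Dict[str, Zone] = {}  # zone name -> last verified copy

    def on_notify(self, fetch_zone: Callable[[], Zone]) -> bool:
        candidate = fetch_zone()           # transfer, but do not serve yet
        if not self.verify(candidate):
            return False                   # keep serving the old copy
        self.served[candidate.name] = candidate
        self.notify_slaves(candidate.name, candidate.serial)
        return True
```

The point of the design is that a zone failing verification never replaces the last good copy, so the public slaves are never told about it.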

Quality measurements of DNSSEC in .se

  • Started by measuring quality and reachability of DNS in Sweden
  • They publish a yearly report on all their findings, including measurements and issues.
  • They built a DNS health check system that collects data from a set of 912 domains.
  • They had a DNSSEC campaign, where they lowered the prices for DNSSEC domains.
  • They helped the DNS operators by checking their zones and by sending regular reports on DNS errors and special DNSSEC error reports to their customer support.
  • They developed a tool for quickly analyzing DNSSEC-signed domains.
  • They are currently testing 6% of DNSSEC domains, which is a lot.
  • They find that a lot of users just use the default values of their DNS signing tools. But a number of tools use very bad default values.

DNSSEC and dealing with hosts that don’t get fragments

Surfnet signed their main domain in 2010 and immediately ran into trouble with the largest ISP in NL, because the ISP’s customers could not resolve the .NL zone. The ISP had an ancient firewall that nobody dared to touch, and it was blocking UDP fragments. Since the zone is signed, the responses often exceed the MTU. They can see ICMP fragment reassembly time exceeded (FRTE) packets, and thus detect hosts that don’t get fragments. Around 60% of hosts have EDNS0 enabled. About 90% of hosts advertise a 4K buffer size, and only very few have changed their setting to actual measured values. The vast majority sets DO=1.

Mitigation: lower the EDNS0 buffer size to 1232, or detect problem hosts dynamically and change the EDNS0 buffer size only for hosts with known issues. The heuristic approach uses 5 rules:

  • ICMP fragment reassembly time exceeded (FRTE) messages seen
  • EDNS0 header toggled on/off
  • Excessive retries during TTL of record
  • Changing EDNS0 buffer size in queries
  • Fallback to TCP without truncation

Any of these may be an indication of a problem.
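
As a rough sketch of how these five indicators could be combined: the field names, the retry threshold, and the per-host bookkeeping below are my own assumptions, not details from the talk. The value 1232 is commonly explained as the IPv6 minimum MTU (1280) minus the IPv6 header (40) and the UDP header (8).

```python
from dataclasses import dataclass

# 1280 (IPv6 minimum MTU) - 40 (IPv6 header) - 8 (UDP header) = 1232
SAFE_BUFSIZE = 1280 - 40 - 8


@dataclass
class HostObservations:
    """Per-resolver observations; all field names are hypothetical."""
    icmp_frte_seen: bool = False          # ICMP fragment reassembly time exceeded
    edns_toggled: bool = False            # EDNS0 switched on/off between queries
    retries_within_ttl: int = 0           # repeated queries while record is still valid
    distinct_bufsizes: int = 1            # different EDNS0 buffer sizes advertised
    tcp_without_truncation: bool = False  # went to TCP without seeing TC=1


def looks_fragment_blind(obs: HostObservations, retry_threshold: int = 3) -> bool:
    """Any single indicator is treated as a hint of a problem."""
    return any([
        obs.icmp_frte_seen,
        obs.edns_toggled,
        obs.retries_within_ttl > retry_threshold,
        obs.distinct_bufsizes > 1,
        obs.tcp_without_truncation,
    ])


def effective_bufsize(advertised: int, obs: HostObservations) -> int:
    """Clamp the buffer size only for hosts that look problematic."""
    if looks_fragment_blind(obs):
        return min(advertised, SAFE_BUFSIZE)
    return advertised
```

For example, `effective_bufsize(4096, HostObservations(icmp_frte_seen=True))` gives 1232, while a host with no indicators keeps its advertised 4096.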

They prepared a tool, which will be released as open source, that can modify the EDNS buffer size on incoming requests.
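
To give an idea of what rewriting the buffer size in a request involves, here is a minimal sketch using only the Python standard library. It is not the Surfnet tool; the function names are made up for illustration. It relies on the wire format from RFC 1035 and RFC 6891: the advertised UDP payload size is carried in the CLASS field of the OPT pseudo-record (type 41).

```python
import struct

OPT = 41  # EDNS0 OPT pseudo-record type


def _skip_name(msg: bytes, off: int) -> int:
    """Return the offset just past the domain name starting at off."""
    while True:
        length = msg[off]
        if length == 0:
            return off + 1
        if length & 0xC0 == 0xC0:   # compression pointer is two bytes long
            return off + 2
        off += 1 + length


def _find_opt(msg: bytes):
    """Return the offset of the OPT record's TYPE field, or None."""
    qd, an, ns, ar = struct.unpack_from("!4H", msg, 4)
    off = 12                                    # fixed DNS header size
    for _ in range(qd):
        off = _skip_name(msg, off) + 4          # QTYPE + QCLASS
    for _ in range(an + ns + ar):
        off = _skip_name(msg, off)
        rtype, _cls, _ttl, rdlen = struct.unpack_from("!HHIH", msg, off)
        if rtype == OPT:
            return off
        off += 10 + rdlen                       # fixed RR fields + RDATA
    return None


def edns_bufsize(msg: bytes):
    """Return the advertised EDNS0 buffer size, or None without EDNS0."""
    off = _find_opt(msg)
    return None if off is None else struct.unpack_from("!H", msg, off + 2)[0]


def clamp_edns_bufsize(msg: bytes, limit: int = 1232) -> bytes:
    """Return msg with the advertised buffer size lowered to at most limit."""
    off = _find_opt(msg)
    if off is None or edns_bufsize(msg) <= limit:
        return msg
    out = bytearray(msg)
    struct.pack_into("!H", out, off + 2, limit)  # CLASS field = payload size
    return bytes(out)


def build_query(name: str, bufsize: int, do: bool = True) -> bytes:
    """Build a minimal A query with an OPT record, for trying the above."""
    header = struct.pack("!6H", 0x1234, 0x0100, 1, 0, 0, 1)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode() for l in labels) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)          # type A, class IN
    flags = 0x8000 if do else 0                          # DO bit in OPT TTL
    opt = b"\x00" + struct.pack("!HHIH", OPT, bufsize, flags, 0)
    return header + question + opt
```

A query built with `build_query("example.nl", 4096)` and passed through `clamp_edns_bufsize` keeps its length and DO bit; only the two bytes holding the advertised size change.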

Currently about 2% of customers are unable to cope with DNSSEC signed zones.

Looking at TLD DNSSEC practices

82 out of 303 TLDs sign


KNOT DNS update

Open source authoritative-only DNS nameserver. Version 1.0 is a testing release, and they expect 1.1 to be production ready. They plan to write the documentation soon.

Their benchmarks look very impressive, but they compiled them themselves.

They show much better results from running under FreeBSD than under Debian, regardless of DNS server software.

OpenDNSSEC update


  • Policy-driven configuration using KASP
  • Out of the box support for PKCS #11 (No dependencies on OpenSSL)
  • Key sharing between zones
  • Scales to 50,000 zones

They have updated documentation and improved issue tracking.

The current recommended version is 1.3, which is a multithreaded signer. Upgrading from previous versions is easy. They expect 1.3 to be stable for a long time.

1.4 will be released mid-summer, will support AXFR/IXFR, and will increase the memory footprint. They will drop the integrated auditor. The source is still available, but support is only for paying customers. (They recommend dnSSexy for verification instead.)

2.0 is planned for Q4 2012: refactoring the enforcer, support for rollover, support for a combined signing key, support for unsigned zones, and incremental transition between NSEC and NSEC3.

Beyond that: database IO, dynamic updates, improved CLI, and a common API for system integration.

They provide on-site training in various places, and if you come to Stockholm it is free. Complete study material is available for free.


DNS measurements using ATLAS

Small probes in various networks that make measurements and report home. The data are available from the RIPE Atlas website after you log in. They are accepting applications for more kinds of measurements. It will soon support filters for probes, such as probes in my AS, probes in my country, etc.

Atlas will soon be able to do everything done by DNSMON, and they foresee a future when it replaces DNSMON.


DNS Anycast deployment of DNS authority servers

ICANN on how they provision and deploy L root. A long time ago they were limited to one server per RR entry, but with anycast that limitation is gone. It allows servers to come closer to users. That also makes it easier to take malfunctioning servers offline, and it keeps attacks closer to the source, which protects the rest of the cloud from the attack. L root has been anycast since 2007, originally with three nodes, ten servers in a location with a router. They then switched to a large number of small, cheap nodes.

Usually root servers are at IXP locations, but L root is now going into eyeball networks directly. It is a single-box solution; they can roll out as virtual machines operated by PCH (usually 4 virtual machines to match the capacity of one physical server). Currently there are 90 servers and 150 VMs in almost 100 locations.

They moved from CentOS to Ubuntu and use Debian package management. Installation is fully automated, and administration is automated with Puppet. They can treat the whole cloud as one system with one set of controls.

Puppet is open source IT automation software.

Ensures all systems have correct packages and configuration.

Single config file per node.

They use DRAC cards for remote setup.

For monitoring they use Intermapper. Puppet is well integrated with Nagios, and they will migrate to Nagios for alerts. They also use Observium for monitoring; currently that is manually configured.

They monitor traffic with DSC. Every server collects stats locally (because DSC cannot support hundreds of servers).
