Tag Archives: openwrt

ICMP Watchdog in Ubiquiti Networks devices

About watchdog

I am using wireless devices from Ubiquiti Networks. Usually everything works fine, but in the rare case of a software or hardware bug it would be great to restart the device automatically. AirOS provides this functionality: it is called “ping watchdog” and is located in the “Services” tab of the web interface. However, there is not a lot of documentation about how it works, so I decided to research it. (Screenshot of the ping watchdog interface with default values.)

Under the hood

Ubnt AirOS is an OpenWRT-based OS with SSH enabled, so we can SSH into the device to find out how this watchdog works. If the ping watchdog is enabled in the web interface, you should see something like this in the process list:

/bin/pwdog -d 300 -p 300 -c 3 -m 300 -e /bin/support /tmp/emerg /etc/persistent/emerg.supp emerg 0; reboot -f 192.168.1.1

This “pwdog” service is a custom BusyBox applet based on the BusyBox ping implementation, modified to implement the watchdog functionality. I was able to find its source code on GitHub.

Here is a detailed description of the pwdog service logic:

  1. On system start it waits -d seconds (300 by default) to allow the hardware and software to initialize. I would not recommend reducing this value, or there is a chance that the device will never finish starting. In the web interface this is the “Startup Delay:” value.
  2. After the initial delay it sends an ICMP ping to the specified host (the last parameter) and waits -p seconds (300 by default, “Ping Interval:” in the web interface). Then this step is repeated.
  3. If there is no reply -c times in a row (3 by default), pwdog runs the command specified in the -e argument (/bin/support /tmp/emerg /etc/persistent/emerg.supp emerg 0; reboot), or just reboots if none is specified. In this example the watchdog also saves support information before rebooting. In the web interface you can change this value with the “Failure Count To Reboot:” parameter.
  4. There is also the -m parameter, which defines a low memory threshold. It is enabled by default and is not configurable via the web interface.
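
The steps above can be sketched as a small shell loop. This is only an illustration of the logic as described in the list, not the actual pwdog source: the function name, the PING_CMD override and the use of `ping -c 1 -W 2` are assumptions made for the example, and the low memory check (-m) is left out.

```shell
#!/bin/sh
# Hypothetical sketch of the watchdog logic described above (not pwdog itself).
# Arguments: host, startup delay (s), ping interval (s), failure count, action.
# PING_CMD may be overridden (e.g. for testing); by default one ICMP echo
# request is sent with a 2 second timeout.
watchdog_loop() {
    wd_host=$1
    wd_delay=$2
    wd_interval=$3
    wd_max_fail=$4
    wd_action=$5

    # step 1: give hardware and software time to initialize
    sleep "$wd_delay"

    wd_missed=0
    while :; do
        # step 2: probe the target once
        if ${PING_CMD:-ping -c 1 -W 2} "$wd_host" >/dev/null 2>&1; then
            wd_missed=0
        else
            wd_missed=$((wd_missed + 1))
            # step 3: too many misses in a row, run the action and stop
            if [ "$wd_missed" -ge "$wd_max_fail" ]; then
                sh -c "$wd_action"
                return 0
            fi
        fi
        sleep "$wd_interval"
    done
}
```

For a harmless dry run, something like `watchdog_loop 192.168.1.1 1 3 3 'echo would reboot'` prints a message instead of rebooting once three probes in a row have failed.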

Below I tested how it works from the command line, with modified parameters:

XM.v5.6.6# /bin/pwdog -d 1 -p 3 -c 3 -m 300 -e /usr/bin/echo -v 192.168.1.1
pwdog[993]: pwdog: do_now=0, initial_sleep=1, timeout=3, retry_count=3, low_mem=300 exec=`/usr/bin/echo`
pwdog[993]: PING Watchdog is checking 192.168.1.1 (192.168.1.1).
pwdog[993]: Missed 1 ping replies in a row.
pwdog[993]: Missed 2 ping replies in a row.
pwdog[993]: Missed 3 ping replies in a row.
pwdog[993]: 4 ping replies missed. Executing `/usr/bin/echo`.

Conclusion

The ICMP watchdog in AirOS is not a very smart service, and the default configuration does not look optimal to me: it is enough to miss only 3 ICMP packets to start the reboot process, yet it fires only after about 15 minutes (3 × 300 seconds) of link failure. So I would recommend increasing the failure count and decreasing the ping interval. I am also thinking about porting apinger to this device, because it provides much more advanced ICMP check functionality.
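
As a rough check of these numbers, the time from link failure to the watchdog action is approximately the ping interval multiplied by the failure count. The helper name below is made up for illustration, and the alternative settings (30 seconds, 10 failures) are only an example, not a tested recommendation.

```shell
#!/bin/sh
# Approximate seconds of continuous link failure before the watchdog fires.
# $1 = ping interval in seconds (-p), $2 = failure count (-c)
time_to_fire() {
    echo $(( $1 * $2 ))
}

echo "default (-p 300 -c 3): $(time_to_fire 300 3) s"
echo "example (-p 30 -c 10): $(time_to_fire 30 10) s"
```

With the defaults this gives 900 seconds (15 minutes). A 30 second interval with 10 failures gives 300 seconds while tolerating more lost packets before a reboot.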


Upgrading TP-Link Archer C7 AC1750 to use with OpenWRT

Why OpenWRT?

One of my home access points is a TP-Link Archer C7. I purchased it to get all the benefits of the 5 GHz 802.11ac standard for the laptop and the 2.4 GHz band for older devices. However, it never worked well for me:

  • In the 5 GHz band, Apple devices were very unstable.
  • Sometimes I had to reboot the router because of Wi-Fi stability issues. After a reboot it worked until the next issue. There are no debug options or logs in the native firmware.
  • The device was spamming the network with STP packets and some other data, with no way to disable this.
  • After upgrading to new firmware versions I had to reconfigure it completely, and in fact the difference between the regularly updated versions was minimal.
  • The native firmware is configurable only via the web interface, and probably includes backdoors 🙂

So I decided to reflash it with OpenWRT and found that I am the “happy” owner of a TP-Link Archer C7 v1, with the AR1A (v1) variant of the QCA9880 chip, which is not supported by the open source ath10k driver. So there is no way to use 5 GHz with OpenWRT at all. The only good thing is that the 5 GHz chip is not soldered to the board but sits in a mini PCIe card socket. So I decided to replace it.

Router upgrade

  • I was able to find a Compex WLE900VX Atheros QCA9880 card on eBay. It supports 802.11ac at 1.3 Gbps with 3×3 MIMO in the 5 GHz band and is supported by the ath10k driver.
  • Before replacing the Wi-Fi card you should install OpenWRT, or the device won't boot at all. I used OpenWRT CC 15.05 for the Archer C7 V1.X; the upgrade was done via the web interface.
  • After OpenWRT is up and running, turn off the device and replace the Wi-Fi card. Be careful with the pigtails, it is very easy to damage them.
  • OpenWRT recognized the card without any additional packages, and it is now working well. You may also want to use the alternate firmware from Candela Technologies; there are some reports that it works better than the one from the vendor.

Limitations

  • Hardware NAT is not supported. I am not using NAT on it, so I don't really care. It probably does not matter at speeds up to 300 Mbit/s.
  • The device has only 8 MB of flash. That is enough for an OpenWRT installation (including LuCI). There are also 2 USB 2.0 ports, so it is easy to extend the storage if needed.

Results

So far everything works great. It is too early to say whether the stability issues are gone, but at least I am now able to do full debugging and tuning if needed. I am planning to benchmark the router later.


LXC on OpenWRT/Turris presentation

Slides from my presentation at Turris Evening by CZ.NIC about LXC in OpenWRT/Turris. A video will follow soon, if you are interested.

Update: video from the presentation:


Monitoring WAN status on OpenWRT using Alarm Pinger

Idea

I am connected to the Internet over a wireless link which is sometimes not very stable. I decided to monitor the status of the link so that I am aware of any problems. Initially I tried to monitor the link with Monit or Nagios + fping, but the results were not very good: this software is not designed for continuous monitoring at very short intervals. So I decided to look for alternatives.

About Alarm Pinger

I had used Alarm Pinger (apinger) with the pfSense distribution, where it monitors WAN links in order to switch between them if needed.

Alarm Pinger (apinger) is a little tool which monitors various IP devices by simple ICMP echo requests. There are various other tools, that can do this, but most of them are shell or perl scripts, spawning many processes, thus much CPU-expensive, especially when one wants continuous monitoring and fast response on target failure. Alarm Pinger is a single program written in C, so it doesn’t need much CPU power even when monitoring many targets with frequent probes. Alarm Pinger supports both IPv4 and IPv6.

This tool supports multiple monitoring targets, external scripts, email notifications and a daemon mode. The only problem was that it was not available as an OpenWRT package, so I decided to port it.

OpenWRT port

After a few tests I found that the code can be compiled with only a few minor patches (autoconf related). You can grab the Makefile for the package from this pull request. Hopefully it will be integrated into the official packages feed soon. Update: the port has been merged.
The port provides an init.d script and a sample configuration. In the future I am also planning to add LuCI integration to show the link status in the web interface.

To build the package on Turris I would recommend using my turris buildroot docker image.

Service configuration

I am using a very simple configuration to monitor the status of the wireless link using pings to the ISP gateway:

# we need to use root because "rainbow" tool fails to work from other uid. 
user "root"
group "root"

# status file with link quality information
status {
    file "/tmp/apinger.status"
    interval 1s
}
# command to run, with alarm type and reason
# if used with multiple targets, %t needs to be added
alarm default {
    command on "/root/gateway.sh %A %r"
    command off "/root/gateway.sh %A %r"
}
# This alarm will be fired when target doesn't respond for 30 seconds.
alarm down "down" {
    time 30s
}
# This alarm will be fired when responses are delayed more than 80ms
# it will be canceled, when the delay drops below 50ms
alarm delay "delay" {
    delay_low 50ms
    delay_high 80ms
}
# This alarm will be fired when packet loss goes over 5%
# it will be canceled, when the loss drops below 3%
alarm loss "loss" {
    percent_low 3
    percent_high 5
}
target default {
    interval 1s
    avg_delay_samples 10
    avg_loss_samples 50
    avg_loss_delay_samples 20
    alarms "down","delay","loss"
}
# ISP Gateway host to monitor. You can define many targets in case of MultiWAN. 
target "1.2.3.4" {
    description "ISP Gateway"
}
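
The delay and loss alarms each use two thresholds, so an alarm that has fired is only canceled once the value falls below the lower one. The function below is a hypothetical illustration of that hysteresis in plain shell, not apinger's own code; the function name is made up and values are treated as integers (for example milliseconds or percent).

```shell
#!/bin/sh
# Hypothetical illustration of two-threshold (hysteresis) alarm logic.
# $1 = current alarm state (0 = off, 1 = on)
# $2 = measured value, $3 = low threshold, $4 = high threshold
# Prints the new alarm state.
hysteresis_next() {
    if [ "$2" -gt "$4" ]; then
        echo 1          # above the high threshold: alarm fires
    elif [ "$2" -lt "$3" ]; then
        echo 0          # below the low threshold: alarm is canceled
    else
        echo "$1"       # between the thresholds: keep the previous state
    fi
}
```

With the delay thresholds from the configuration (50 and 80), a 60 ms average neither raises an inactive alarm nor cancels an active one, which avoids flapping when the delay hovers around a single limit.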

I am also using a simple script to change the WAN LED color in case of problems:

#!/bin/sh

DEF_COLOR=33FF33 # see https://gitlab.labs.nic.cz/turris/rainbow/blob/master/turris.c
WARNING_COLOR=FFFF00 # yellow
DOWN_COLOR=red
RAINBOW=/usr/bin/rainbow

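logger "event: $*"
# read the active alarms line from the status file (not used further below)
STATUS=`grep  "Active alarms:" /tmp/apinger.status`

# "$*" joins all arguments (alarm type and reason) into a single string
case "$*" in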
"delay ALARM")
  touch /tmp/apinger.delay.flag
  ;;
"delay alarm canceled")
  rm -f /tmp/apinger.delay.flag
  ;;
"down ALARM")
  touch /tmp/apinger.down.flag
  ;;
"down alarm canceled")
  rm -f /tmp/apinger.down.flag
  ;;
"loss ALARM")
  touch /tmp/apinger.loss.flag
  ;;
"loss alarm canceled")
  rm -f /tmp/apinger.loss.flag
  ;;
esac
# link is down
if [ -e /tmp/apinger.down.flag ]; then
  ${RAINBOW} wan ${DOWN_COLOR}
  exit
fi
# loss or delay
if [ -e /tmp/apinger.loss.flag -o -e /tmp/apinger.delay.flag ]; then
  ${RAINBOW} wan ${WARNING_COLOR}
  exit
fi
# no active alarms found
${RAINBOW} wan ${DEF_COLOR}

This works pretty well: if the line is down, the WAN LED is red; if it is unstable or congested, it is yellow. We can also monitor the link status manually:

root@turris:~# cat /tmp/apinger.status
Fri Apr 10 12:39:24 2015

Target: 1.2.3.4
Description: ISP Gateway
Last reply received: #2876 Fri Apr 10 12:39:23 2015
Average delay: 3.247ms
Average packet loss: 0.0%
Active alarms: None
Received packets buffer: ################################################## ###################.
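
Since the status file is plain "Label: value" text, single fields can be pulled out with sed, for example from another script. This is a minimal sketch assuming the single-target layout shown above; the function name is made up for illustration, and the label is passed to sed unescaped, so it should not contain regular expression special characters.

```shell
#!/bin/sh
# Print the value of one "Label: value" line from an apinger status file.
# $1 = field label (e.g. "Active alarms"), $2 = path to the status file
# Only the first matching line is printed, so with several targets this
# returns the value for the first target in the file.
status_field() {
    sed -n "s/^$1: *//p" "$2" | head -n 1
}
```

For example, `status_field "Active alarms" /tmp/apinger.status` prints only the value part of that line, and `status_field "Average packet loss" /tmp/apinger.status` prints the current loss percentage.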

Todo

I am planning to extend the functionality of the script with some cool features:

  • Integrate with LuCI to show the status in the web interface.
  • Add support for failover to the LTE channel if the link is down (and an LTE dongle is connected).
  • Enable the RRDtool support provided by apinger.

Asterisk g729 codec for OpenWRT

Today the OpenWRT telephony maintainers committed codec_g729 to the feed. I am already using this package on my home router and it works pretty well. Binary packages should be available soon.
