Tue Nov 5 10:40:51 2013
Emacs registers as bookmarks
- I spend a lot of time in Emacs and constantly need to jump to the same files and directories.
- It's easy to lose track of what you were looking for while navigating there.
- With over a hundred buffers open at any time, registers have saved me a lot of effort.
(set-register ?i '(file . "~/.emacs.d/init.el"))
- Any time I want to get to my init file I just type
C-x r j i
Sun Oct 27 22:51:54 2013
Monitoring the ever-changing cloud.
- Autoscaling in AWS creates an interesting use case for most monitoring tools out there (e.g. Nagios/Icinga).
- With this in mind, I decided to implement a monitoring system that removed the alert conditions from the endpoint.
- Set up each node to export the information of interest, and let one or more services decide if an alert is warranted.
- So we created Pinky. It builds on OpenResty (nginx + lua and others).
- Write rest controllers to provide the information in json format.
- A collector written in Go can pull down all of the json.
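To make the pull model concrete, here is a minimal sketch of what such a collector does. It is not the Go collector itself: the function names, the thread pool, and the 5 second timeout are assumptions for illustration, and the output file naming (host, then the endpoint name taken from the second url component) is inferred from the file names shown further down.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from urllib.request import urlopen


def http_fetch(host, port, url, timeout=5):
    """Fetch one endpoint over HTTP and return the decoded JSON document."""
    with urlopen(f"http://{host}:{port}{url}", timeout=timeout) as resp:
        return json.load(resp)


def collect(hosts, url, outdir, port=44444, workers=100, fetch=http_fetch):
    """Query `url` on every host in parallel and write one JSON file per host.

    Returns a dict mapping each host that failed to its error message.
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    # Assumed naming: "/pinky/disk" -> "disk", "/pinky/ping/1.2.3.4" -> "ping"
    endpoint = url.strip("/").split("/")[1]

    def one(host):
        try:
            doc = fetch(host, port, url)
        except Exception as exc:  # one unreachable node must not stop the run
            return host, str(exc)
        (outdir / f"{host}-{endpoint}.json").write_text(json.dumps(doc))
        return host, None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(one, hosts))
    return {host: err for host, err in results if err is not None}
```

Passing the fetch function in as a parameter keeps the parallel part testable without a network; a node that does not answer simply ends up in the returned error dict instead of producing a file.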
Pinky - The agent
The Pinkies, and their purposes:
- chef: return 'last cooked' time.
- disk: return df(1) output.
- dpkg: Outstanding security updates
- ec2meta: return all ec2metadata from any node.
- hello: Hello world (used for test)
- load: Return the output of load
- log: Return the last # lines of a log.
- memcache: Memcache monitoring
- memfree: "free -m" output
- mydb: Mysql Slave delay monitor.
- netstat: netstat(1) output
- nginx: Nginx log parser for returning all lines matching a given date. (under development)
- passenger: Return passenger-status output bypassing ruby.
- ping: Ask a given node to ping another host 1 time.
- port: Ask a given node to check a port via nc(1)
- proc: Test pinky for walking the /proc tree.
- process: Return the full process tree.
- redis: Query info; as well as other arbitrary values.
- runit: Return the status of all runit(1) services.
- rvm: Return rubies, their gems, and versions.
- stat: return fstat(2) on any file.
- unicorn: Verify if any unicorns are running on older code.
- vmstat: Return vmstat(1) output.
Labrat - The collector
- The collector is available here
- Usage is as follows:
ubuntu@labrat:/tmp/temp$ /data/labrat/labrat --help
Usage of /data/labrat/labrat:
  -c=1: number of parallel requests
  -p="44444": default pinky port
  -s="./servers.txt": List of servers to hit
  -u="/pinky/disk": url to hit
- Using ec2din we generate a servers.txt in the current directory containing the internal names of all instances.
ubuntu@labrat:/tmp/temp$ cat servers.txt|wc -l
322
- Now let's query the disk endpoint on all of them, saving each into a host.monitor.json file.
ubuntu@labrat:/tmp/temp$ time /data/labrat/labrat -c=100 -u="/pinky/disk"
real 0m4.311s
user 0m0.044s
sys 0m0.144s
- Each file looks like
$ cat ip-10-112-12-9.ec2.internal-disk.json
{"status":{"error":"","value":"OK"},"data":{"\/run":["tmpfs","1525900","188","1525712","1%"],"\/dev":["udev","3806708 ","12","3806696","1%"],"\/run\/lock":["none","5120","0","5120","0%"],"\/run\/shm":["none","3814740","0","3814740","0%"],"\/":["\/dev\/xvda1","82569904","15344892","63031632","20%"],"\/mnt":["\/dev\/xvdb","433455904","221932","411215668","1%"]},"system":{"name":"prod-lin-app-s2-i-3ad14d7","time":1382940141}}
- Given a working directory with hundreds of json files, one can easily see the potential.
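As one example of that potential, here is a minimal sketch that flags nearly full filesystems from a single disk document. The function name and the default threshold of 90 are made up for illustration; the only thing taken from the output above is the layout, where each mount point maps to a list whose last element is the use percentage.

```python
import json


def full_filesystems(doc, threshold=90):
    """Return sorted (mount, percent) pairs at or above `threshold` percent."""
    if doc["status"]["value"] != "OK":
        raise ValueError(doc["status"]["error"] or "endpoint reported a failure")
    over = []
    for mount, fields in doc["data"].items():
        # Last column is the use percentage, e.g. "20%"
        percent = int(fields[-1].strip().rstrip("%"))
        if percent >= threshold:
            over.append((mount, percent))
    return sorted(over)


def check_file(path, threshold=90):
    """Load one saved disk document from `path` and check it."""
    with open(path) as fh:
        return full_filesystems(json.load(fh), threshold)
```

Running check_file over every disk file in the working directory gives a fleet-wide disk report without touching the nodes again.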
Can my app servers reach mysql?
- Another example showing how I can find network partitions that a centralized monitoring system may not see.
$ time /data/labrat/labrat -c=200 -u="/pinky/ping/181.72.12.28"
real 0m4.242s
user 0m0.196s
sys 0m0.308s
$ cat ip-10-81-56-37.ec2.internal-ping.json # A fail
{"ip":"181.72.12.28","system":{"name":"prod-lin-app-s1-i-ba8d191e","time":1382945706},"status":{"error":"100%% packet loss,","value":"FAIL"},"data":"PING 181.72.12.28 (181.72.12.28) 56(84) bytes of data.--- 181.72.12.28 ping statistics ---1 packets transmitted, 0 received, 100% packet loss, time 0ms"}
$ cat ip-10-81-21-87.ec2.internal-ping.json # A pass
{"ip":"181.72.12.28","system":{"name":"prod-lin-app-s1-i-fa8f099e","time":1382946218},"status":{"error":"","value":"OK"},"ping_time":"26.4","data":"PING 181.72.12.28 (181.72.12.28) 56(84) bytes of data.64 bytes from 216.34.181.45: icmp_req=1 ttl=234 time=26.4 ms--- 181.72.12.28 ping statistics ---1 packets transmitted, 1 received, 0% packet loss, time 0msrtt min\/avg\/max\/mdev = 26.478\/26.478\/26.478\/0.000 ms"}
- This can also monitor changes in security groups and network rules that allow/disallow access to a given host from a given group.
- We also have a port endpoint to check specific services.
- Processing the json is trivial and allows for easy integration into multiple projects.
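To show how little processing is needed, here is a minimal sketch that lists every node whose ping check failed. The function name and the glob pattern are assumptions based on the file names above; the status and system fields are the ones visible in the sample output.

```python
import json
from pathlib import Path


def unreachable_from(directory, pattern="*-ping.json"):
    """Return the sorted names of nodes whose ping check did not report OK."""
    failed = []
    for path in Path(directory).glob(pattern):
        doc = json.loads(path.read_text())
        if doc["status"]["value"] != "OK":
            failed.append(doc["system"]["name"])
    return sorted(failed)
```

If the returned list is empty, every node that answered could reach the target; a handful of names from the same group points at a partition or a security group change rather than a dead target.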
Icinga - Modified to read our directory of json
- Icinga is able to gather the data much faster via local disk read than nrpe calls.
- The load on the Icinga server is much lower.
- Using a tmpfs drive to store all the json helps us not run out of space and gives us a performance bump.
- Server and group configs are automatically generated and reloaded once a minute. This prevents autoscaling from causing alerts.
Librato - Long term metrics storage.
- Using the directory of json we can service multiple needs.
- Monitors can use historical data from Librato, as well as the current state (kept in the directory of json).
Building complex monitors
- Using Cucumber, we can build feature files that consider multiple conditions before alerting.
Feature: Memcache connectivity
  Scenario: Memcache error rate due to network connectivity
    Given that the app is reporting memcache issues
    And memcache monitor is not reporting memcache errors
    And network connectivity between the app and memcache is elevated
    Then send alert to ops about ec2 network issue.
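The logic behind a scenario like this boils down to combining several independent observations. Here is a minimal sketch of that combination, not the actual Cucumber step definitions: the function name and its arguments are hypothetical, and it reads the network condition as "at least one app to memcache ping check failed", which is only one possible interpretation.

```python
def should_alert_ops(app_sees_memcache_errors, memcache_sees_errors, ping_docs):
    """Decide whether to alert ops about an ec2 network issue.

    Alert only when the app complains about memcache, memcache itself
    looks healthy, and at least one ping document did not report OK.
    """
    network_degraded = any(
        doc["status"]["value"] != "OK" for doc in ping_docs
    )
    return bool(
        app_sees_memcache_errors
        and not memcache_sees_errors
        and network_degraded
    )
```

Any single condition on its own would page the wrong team; it is the combination that points at the network rather than at the app or at memcache.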
Current status:
- Pinky and Pinky-server are both actively developed.
- New endpoints get created as needed.
- Developers are able to help build out requirements for new projects.
Other notes
- Lua is fast. (duh)
- Nginx is a known entity and has proved to be robust.
- Much easier to debug endpoints using curl(1) or the pinky command (vs nrpe).
- We stay DRY by combining metrics, monitoring, and system acceptance tests.
Sat Oct 26 23:28:42 2013 :wireshark: memcache:
Debugging memcache in realtime remotely with Wireshark and command line.
- First ensure you have tshark installed (assuming OS X)
brew install wireshark
- Next, build a script to display all of the memcache values on the command line output.
$ tshark -G|awk -v f="'" 'BEGIN{ print "ssh \$1 \"tcpdump -w- -s0|gzip\" | tshark -r- -Nn -tad -R \"memcache\" \\" }
  ($3 ~ "memcache") {print " -z \"proto,colinfo," $3 "," $3 "\" \\" }
  END{ print "|LANG=C sed -e " f "s# == ##g" f " -e " f "s#memcache.##g" f }' > read-memcache
$ chmod a+rx read-memcache
$ ./read-memcache some-remote-memcacheserver.mydomain.com
How it works.
- 'tshark -G' generates all of the variables available to dissectors.
- Using these we build the display list for each.
- Running it we get output such as this.
1 2013-10-26 23:38:54.062904 10.64.112.196 -> 10.124.95.242 MEMCACHE 123 get linbsd2:breaking_news_banners:tag:site-wide command"get" key"linbsd2:breaking_news_banners:tag:site-wide"
4 2013-10-26 23:38:54.062992 10.111.95.242 -> 10.64.157.196 MEMCACHE 71 END response"END"
6 2013-10-26 23:38:54.063428 10.212.199.159 -> 10.124.95.242 MEMCACHE 113 get linbsd2:views/tag_smurfs_217 command"get" key"linbsd2:views/tag_smurfs_217"
7 2013-10-26 23:38:54.063459 10.124.125.142 -> 10.212.199.159 MEMCACHE 674 VALUE linbsd2:views/tag_smurfs_217 0 546 flags0 value"\x04\x08\"\x02\x1c\x02<!-- \"tag_smurfs_217\" cached 10/26/2013 at 06:03 PM -->\x0a\x09\x09\x09...
10 2013-10-26 23:38:54.065539 10.14.150.255 -> 10.124.95.242 MEMCACHE 119 get linbsd2:views/tag_smurfs_16_mobile command"get" key"linbsd2:views/tag_smurfs_16_mobile"
11 2013-10-26 23:38:54.065575 10.14.25.222 -> 10.214.150.255 MEMCACHE 71 END response"END"
- We can filter for errors with augmentations to the -R "filter"
- Will add more filters later on for spotting memcache errors other than "missing key"
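Until those filters exist, the output is regular enough to post-process. Here is a minimal sketch that tallies how often each key is requested; it assumes the key"..." field format shown in the output above, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the key"..." field that the sed step leaves in each request line
KEY_FIELD = re.compile(r'\bkey"([^"]*)"')


def count_requested_keys(lines):
    """Count how often each memcache key appears in the captured lines."""
    counts = Counter()
    for line in lines:
        match = KEY_FIELD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it sys.stdin from the capture script and printing counts.most_common(10) gives a quick view of the hottest keys; lines without a key field, such as END responses, are skipped.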
2013-10-26 Sat
Rejoined NetBSD to help work on #lua in the kernel. Want to extend Pinky to work with the tinyhttpd that mbalmer extended with lua support.