Remote Work Update

Over a year ago, I left San Antonio and started working from home in Lexington, KY. (previously) AV Tools Microphone: Still very happy with the Meteor microphone. Camera: Purchased a Logitech HD Portable 1080p Webcam C615 with Autofocus. Works with OS X 10.9.4 well enough. The only pain was reconfiguring various VC software to use the new camera. Hangouts users, beware: after every Flash update you must set your non-default video/mic....

April 27, 2015 · itsahill00

Troubleshooting LLDP

LLDP is a wonderful protocol which paints a picture of datacenter topology. lldpd is a daemon that runs on your servers, receives LLDP frames, and outputs network location and more. There’s also a recently patched lldp Ansible module. Like all tools, LLDP/lldpd has had some issues. Here are the ones I’ve seen in practice, with diagnosis and resolution: Switch isn’t configured to send LLDP frames. Diagnosing: [code] tcpdump -i eth0 -s 1500 -XX -c 1 'ether proto 0x88cc' [/code]...
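The tcpdump filter above matches on the LLDP EtherType, 0x88cc. As a rough illustration of what that filter checks, here is a minimal Python sketch that inspects the EtherType field of a raw, untagged Ethernet frame. The function name and the hand-built sample frame are hypothetical and are not part of lldpd or tcpdump.

```python
LLDP_ETHERTYPE = 0x88CC  # same value used in the tcpdump filter


def is_lldp_frame(frame: bytes) -> bool:
    """Return True if an untagged Ethernet frame carries LLDP.

    Bytes 0-5 are the destination MAC, 6-11 the source MAC,
    and 12-13 the EtherType (big-endian).
    """
    if len(frame) < 14:
        return False
    ethertype = int.from_bytes(frame[12:14], "big")
    return ethertype == LLDP_ETHERTYPE


# Hand-built sample frame (hypothetical addresses, empty payload).
dst = bytes.fromhex("0180c200000e")  # LLDP nearest-bridge multicast address
src = bytes.fromhex("020000000001")  # made-up locally administered MAC
sample = dst + src + LLDP_ETHERTYPE.to_bytes(2, "big") + b"\x00\x00"

print(is_lldp_frame(sample))
```

If a check like this never matches on traffic captured from the server's interface, that is consistent with the switch not sending LLDP frames at all.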

April 7, 2015 · itsahill00

SDN Development Environment

Recently, I began a deeper dive into SDN and OpenFlow. Overall I was very happy with the process and quality of the material out there for newcomers. However, I noticed a gap when I hit my first stumbling block. I set up a mininet instance and noticed it was running Open vSwitch (OVS) v2.0. I needed a newer version of OVS, and turfed the mininet instance while upgrading OVS. It quickly became apparent that I needed a repeatable development environment setup....

January 5, 2015 · itsahill00

Graphite Events with a Timestamp

There are a few good posts out there about Graphite Events, covering how and why to use them. Earlier I was trying to add events to Graphite but ran into an issue: my events used a timestamp in the past. The examples I found only showed publishing events with a ‘now’ timestamp. I went digging through the Graphite extension that adds events: the functionality exists. Just add a ‘when’ to your payload with a Unix timestamp....
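A minimal sketch of a payload with a ‘when’ field, assuming Graphite's events endpoint lives at /events/ and accepts a JSON body with ‘what’, ‘tags’, and ‘data’ fields. The helper names and the example host are hypothetical, and ‘tags’ is sent as a plain string here, which may need adjusting for your Graphite version.

```python
import json
import time
import urllib.request


def build_event(what, tags, when=None, data=""):
    """Build a Graphite event payload; `when` is a Unix timestamp in seconds."""
    event = {"what": what, "tags": tags, "data": data}
    if when is not None:
        event["when"] = int(when)  # omit the key to let Graphite use 'now'
    return event


def post_event(base_url, event):
    """POST the event as JSON to the (assumed) /events/ endpoint."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/events/",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# An event that happened one hour ago.
one_hour_ago = int(time.time()) - 3600
event = build_event("deploy finished", "deploy", when=one_hour_ago)
print(json.dumps(event))

# Hypothetical host; uncomment to send the event for real.
# post_event("http://graphite.example.com", event)
```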

December 8, 2014 · itsahill00

My OpenStack Paris Summit Sessions

Capacity Management/Provisioning (Cloud’s full, Can’t build here) - Video, Slides As a service provider, Rackspace is constantly bringing new OpenStack capacity online. In this session, we will detail a myriad of challenges around adding new compute capacity. These include: planning, automation, organizational, quality assurance, monitoring, security, networking, integration, and more. Managing Open vSwitch Across a Large Heterogenous Fleet - Video, Slides Open vSwitch (OVS) is one of the more popular ways to provide VM connectivity in OpenStack....

November 5, 2014 · itsahill00

nsxchecker: Verify the health of your NSX network

Recently I got to work with the NSX API and write a tool to do a quick health check of NSX networks. nsxchecker is a valuable operational tool to quickly report an NSX network’s health. One of the promises of SDN is automated tooling for operational teams and with the NSX API I was quickly able to deliver. nsxchecker accepts an NSX lswitch UUID or a neutron_net_id. Rackspace’s Neutron plugin, quark, tags created lports with a neutron_net_id....

October 7, 2014 · itsahill00

Operating OpenStack: Monitoring RabbitMQ

At the OpenStack Operators meetup, a question came up about monitoring issues related to RabbitMQ. Lots of OpenStack components use a message broker and the most commonly used one among operators is RabbitMQ. For this post I’m going to concentrate on Nova and a couple of scenarios I’ve seen in production. It’s important to understand the flow of messages amongst the various components and break things down into a couple of categories:...

September 2, 2014 · itsahill00

Monitoring Edge Node Network Configuration

Over the last few months I’ve done a bit of work around monitoring, Open vSwitch, and XenServer. This post lists some of the networking/Open vSwitch specific items to monitor on hypervisors. Link Status: The Nagios SNMP Interfaces plugin works well for reporting a failed link as well as reporting error rates and inbound/outbound bandwidth. Open vSwitch Manager and Controller Status: Transport Node Status is a quick and dirty Python script which can be used with extended SNMP to alert when OVS loses a connection to a manager/controller....
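To illustrate the manager/controller check, here is a rough sketch of one way such a check could work. It is not the Transport Node Status script itself. It assumes the text comes from `ovs-vsctl show` and that a connected Manager or Controller entry is immediately followed by an `is_connected: true` line. The function name and sample output are hypothetical.

```python
def disconnected_targets(show_output):
    """Return Manager/Controller entries with no 'is_connected: true' line."""
    missing = []
    pending = None  # entry whose following line has not been checked yet
    for raw in show_output.splitlines():
        line = raw.strip()
        if pending is not None:
            if line == "is_connected: true":
                pending = None
                continue
            missing.append(pending)
            pending = None
        if line.startswith(("Manager ", "Controller ")):
            pending = line
    if pending is not None:
        missing.append(pending)
    return missing


# Hypothetical output: the manager is connected, the controller is not.
SAMPLE = """\
    Manager "ssl:192.0.2.10:6632"
        is_connected: true
    Bridge "br0"
        Controller "ssl:192.0.2.10:6633"
        fail_mode: secure
        Port "eth0"
            Interface "eth0"
"""

problems = disconnected_targets(SAMPLE)
print("CRITICAL: " + ", ".join(problems) if problems else "OK")
```

A wrapper exposed through extended SNMP could map an empty list to an OK state and anything else to an alert.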

July 28, 2014 · itsahill00

On Failure

A couple of interesting research papers around failure, found in The Datacenter as a Computer. Failure Trends in a Large Disk Drive Population (2007) Out of all failed drives, over 56% of them have no count in any of the four strong SMART signals, namely scan errors, reallocation count, offline reallocation, and probational count. In other words, models based only on those signals can never predict more than half of the failed drives....

May 9, 2014 · itsahill00

On Working Remote

In late March I relocated from San Antonio, TX to Lexington, KY. Same awesome job just with a twist…REMOTE WORK! I am mainly collaborating via IRC, tmux, 1:1/M:M TeamSpeak, and 1:1/M:M video conferencing. My takeaways after the first month: Obvious, but it took me a while: When you’re talking, LOOK AT THE CAMERA - multiple monitors and a MacBook make this a little awkward. Quality matters: appropriate microphones & video, particularly conference rooms, make a huge difference....

April 29, 2014 · itsahill00