Posts

29 Nov 2018 » Robust Upgrade on Ubuntu, using ZFS and Containers

Introduction Recently I stumbled across a Twitter post highlighting the benefits of ZFS Boot Environments; see here. Next in that thread, it states: “Unfortunately, #linux has little to offer with the same functionality as #ZFS, especially with Red Hat abandoning #BTRFS” (see here). I took this as an implication that it’s not possible to implement a solution similar to ZFS Boot Environments on Linux, which I don’t completely agree with.

23 Apr 2018 » Ubuntu-Based Appliance Build Architecture (slides)

Agenda 1. Overview of current illumos-based build architecture. 2. Problems with current build architecture. 3. Goals of new Ubuntu-based build architecture. 4. Overview of new build architecture. 5. Details of new build architecture. 1 – Overview of current build architecture: Converts source code into virtual machine (VM) artifacts: OVA, VHD, qcow2, etc.

01 Dec 2017 » Principles of OS Debugging

The most important bugs occur at customer sites, cause downtime for mission-critical machines, and cannot be deterministically reproduced. Customers should not and will not perform experiments and debugging tasks for us. We must be able to perform root-cause analysis from a single crash dump and deliver an appropriate fix. Debugging support for new subsystems is required at the same time as project integration. The above points were taken from slide 5 of Mike Shapiro’s presentation “A Brief Tour of the Modular Debugger” ca.

17 Nov 2017 » Performance Testing Results for ZFS on Linux #6566

The following are links to the Jupyter notebooks that describe the performance testing that I did for ZFS on Linux #6566, and the results of that testing: Max Rate Submit on HDDs; Max Rate Submit on SSDs; Fixed Rate Submit on HDDs; Fixed Rate Submit on SSDs. Additionally, a compressed tarball with all the raw data used to generate those Jupyter notebooks can be found here.

24 Oct 2017 » ZIL Performance: How I Doubled Sync Write Speed (slides)

Agenda 1. What is the ZIL? 2. How is it used? How does it work? 3. The problem to be fixed; the solution. 4. Details on the changes I made. 5. Performance testing and results. Here’s a brief overview of what I plan to discuss. It’s broken up into roughly 3 parts: First I’ll give some background, and discuss:

28 Sep 2017 » Generating Code Coverage Reports for ZFS on Linux

Introduction This is another post about collecting code coverage data for the ZFS on Linux project. We’ve recently added a new make target to the project, so I wanted to highlight how easy it is to use this target to generate static HTML-based code coverage reports, and/or to generate a report that can be used with other tools and services (e.g. codecov.io). Examples Before I get into the specifics of how to run the tests and generate the coverage report, I want to show off the results.

26 Sep 2017 » ZFS on Linux Code Coverage (slides)

Branches + Pull Requests Code coverage data is collected for: all commits merged to a branch (e.g. master); all pull requests for the “zfs” project. Code coverage collected after running all tests: ztest, zfstest, zfsstress, etc. Data generated using make code-coverage-capture … Emits .info file and static HTML pages; .info file uploaded to codecov.io. ZFS on Linux + Codecov

19 Sep 2017 » Python + Jupyter for Performance Testing (slides)

Setting the stage. Working on performance improvement to ZFS (sync writes). To verify my changes, I needed to: measure the performance of the system without my changes; measure the performance of the system with my changes; analyze the difference(s) in performance with and without my changes; collect tangential information from the system, to support (or refute) my conclusions. Visualizations required? While not strictly required, visualizations are often powerful.

18 Sep 2017 » Code Coverage for ZFS on Linux

I’ve been working with Brian Behlendorf on getting code coverage information for the ZFS on Linux project. The goal was to get code coverage data for pull requests, as well as branches; this way, we can get a sense of how well tested any given PR is by the automated tests, prior to landing it. There are still some wrinkles that need to be ironed out, but we’ve mostly achieved that goal by leveraging codecov.

11 Sep 2017 » Using "gcov" with ZFS on Linux Kernel Modules

Building a “gcov” Enabled Linux Kernel In order to extract “gcov” data from the Linux kernel, and/or Linux kernel modules, a “gcov” enabled Linux kernel is needed. Since my current development environment is based on Ubuntu 17.04, and Ubuntu doesn’t provide a pre-built kernel with “gcov” enabled, I had to build the kernel from source. This was actually pretty simple, and most of that process is already documented here.

08 Sep 2017 » Performance Testing Results for OpenZFS #447

The following are links to the Jupyter notebooks that describe the performance testing that I did for OpenZFS #447, and the results of that testing: Max Rate Submit on HDDs; Max Rate Submit on SSDs; Fixed Rate Submit on HDDs; Fixed Rate Submit on SSDs. Additionally, a compressed tarball with all the raw data used to generate those Jupyter notebooks can be found here.

07 Sep 2017 » Using Python and Jupyter for Performance Testing and Analysis

Introduction I recently worked on some changes to the OpenZFS ZIL (see here), and in the context of working on that project, I discovered some new tools that helped me run my performance tests and analyze their results. What follows are some notes on the tools that I used, and how I used them. Quick Overview Before I dive into the details of how I used these tools, I wanted to quickly go over what the tools were:

05 Sep 2017 » Building and Using "crash" on Ubuntu 16.04

Introduction I’ve been working on the ZFS on Linux project recently, and had a need to use crash on the Ubuntu 16.04 based VM I was using. The following are some notes regarding the steps I had to take, in order to build, install, and ultimately run the utility against the “live” system. Build and Install “crash” First, I had to install the build dependencies: $ sudo apt-get install -y git build-essential libncurses5-dev zlib1g-dev bison Then I could check out the source code, build, and install:

28 Aug 2017 » Using BCC's "trace" Instead of "printk"

Introduction Recently I’ve been working on porting some changes that I made to the OpenZFS ZIL over to the ZFS on Linux codebase; see here for the OpenZFS pull request, and here for the ZFS on Linux pull request. In my initial port, I was running into a problem where the automated tests would trigger a “hang” as a result of the readmmap program calling msync: $ pstree -p 2337 test-runner.

04 Aug 2017 » OpenZFS: Isolating ZIL Disk Activity

I recently completed a project to improve the performance of the OpenZFS ZIL (see here for more details); i.e. improving the performance of synchronous activity on OpenZFS, such as writes using the O_SYNC flag. As part of that work, I had to run some performance testing and benchmarking of my code changes (and the system as a whole), to ensure the system was behaving as I expected. Early on in my benchmarking exercises, I became confused by the data that I was gathering.

16 Mar 2017 » Running `sshd` on Windows using Cygwin

Introduction As part of our effort to support Delphix in the Azure cloud environment, we’re writing some automation to convert our .iso install media into a VHD image, leveraging Jenkins and Packer in the process. Essentially, we want to use a Windows server as a Jenkins “slave”, and run Packer from within a Jenkins job that will run on that Windows system. In order to do that, the Jenkins “master” needs to connect with the Windows system, such that it can configure the system to act as a Jenkins slave.

15 Mar 2017 » OpenZFS: Notes on ZIL Transactions

Introduction The OpenZFS Intent Log (ZIL) is used to ensure POSIX compliance of certain system calls (that modify the state of a ZFS dataset), and protect against data loss in the face of failure scenarios such as: an operating system crash, power loss, etc. Specifically, it’s used as a performance optimization so that applications can be assured that their given system call, and any “user data” associated with it, will not be “lost”, without having to wait for an entire transaction group (TXG) to be synced out (which can take on the order of seconds, on a moderately loaded system).

22 Feb 2017 » OpenZFS: Refresher on `zpool reguid` Using Examples

Introduction The zpool reguid command can be used to regenerate the GUID for an OpenZFS pool, which is useful when using device level copies to generate multiple pools all with the same contents. Example using File VDEVs As a contrived example, let’s create a zpool backed by a single file vdev: # mkdir /tmp/tank1 # mkfile -n 256m /tmp/tank1/vdev # zpool create tank1 /tmp/tank1/vdev # zpool list tank1 NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT tank1 240M 78K 240M - 1% 0% 1.

06 Feb 2017 » Creating a Custom Amazon EC2 AMI from ISO (using OI Hipster)

Preface In this post, I’ll pick up from where I left off last time, and demonstrate one potential way to convert the installation ISO media generated in that post into an AMI that can be used to create new VMs in the Amazon EC2 environment. It’s important to note a couple of things before we start: While I’ll be generating an AMI based on OI Hipster, this process should be applicable to any Linux or FreeBSD based operating system as well (and quite possibly Windows too, but I don’t know much about that platform).

01 Feb 2017 » Creating Custom Installation Media for OI Hipster

Preface This post is a write-up of my notes for creating custom installation media for OpenIndiana Hipster, using a custom/patched version of illumos. It assumes that OI Hipster has already been installed on a machine (e.g. installed on a VM using their provided installation media); this machine will be used to build our custom version of illumos, as well as the custom OI installation media. The goal of this exercise is to create a “Live DVD” that can be used to install our custom version of illumos.

22 Mar 2016 » Effective Communication of Code Changes (slides)

Agenda 1. What’s the problem? 2. Story Time 3. My Proposed Solution. Give a brief introduction of the topics that will be discussed. Code is our Enemy > Code is bad. It rots. It requires periodic maintenance. It has bugs that need to be found. New features mean old code has to be adapted.

07 Dec 2015 » OpenZFS: Artificial Disk Latency Using zinject

About a year ago I had the opportunity to work on a small extension to the OpenZFS zinject command with colleagues Matt Ahrens and Frank Salzmann, during one of our Delphix engineering-wide hackathon events. Now that it’s in the process of landing in the upstream OpenZFS repository, I wanted to take a minute to show it off. To describe the new functionality, I’ll defer to the help message:

23 Mar 2015 » OpenZFS: Reducing ARC Lock Contention

tl;dr: Cached random read performance of 8K blocks was improved by 225% by reducing internal lock contention within the OpenZFS ARC on illumos. Introduction Locks are a pain. Even worse is a single lock serializing all of the page cache accesses for a filesystem. While that’s not quite the situation the OpenZFS ARC was in earlier this year, it was pretty close. For those unfamiliar, the OpenZFS file system is built atop its very own page cache based on the paper by Nimrod Megiddo and Dharmendra S.

06 Jan 2015 » OpenZFS Developer Summit 2014: OpenZFS on illumos

The OpenZFS project is growing! The second annual OpenZFS developer summit concluded just under two months ago, and overall I thought it went very well. There were roughly 70 attendees, twice as many as the previous year, and the talks given were very engaging and interesting. I gave a short talk about ZFS on the illumos platform, and also touched briefly on some of my subjective opinions coming from a ZFS on Linux background.

10 Nov 2014 » OpenZFS on illumos (slides)

Where OpenZFS Originated: 2001 – Started at Sun; 2005 – Released through OpenSolaris; 2010 – illumos spawned, fork of OpenSolaris; 2013 – OpenZFS created. OpenZFS’s “home” is in illumos: due to history, but also OS integration (grub, mdb, fma, etc.). But OpenZFS is growing beyond illumos. Development model on illumos: Committer access is granted to “advocates”. Advocates rely on “reviewers” to verify changes