Friday, December 18, 2015

Git cheat sheet

Creating a feature branch

Creating a feature branch is a good way to keep your testing out of the main branch. You can hack away at your code without disturbing the general workflow.
The following command will create a feature branch that tracks master. Pushing it with the -u flag will automatically make your local version of the branch track the remote version:
     $ git checkout master && git pull
     $ git checkout -t -b "some-feature-branch-name"
     $ git push -u origin "some-feature-branch-name"
Also, since this is a feature branch, you're free to force-push and clean up your git history.
You can see what your branches are tracking and what their origin is with the following command:
$ git branch -vv
  1432-rb-force-safety-switch 62d5af7 [origin/1432-rb-force-safety-switch] Refuse rb if path contains keys.
* develop                     5f56434 [origin/develop: ahead 233, behind 2] Merge branch 'release-1.8.12'
And to see where "origin" points to:
$ git remote -v
origin  git@github.com:stephen-mw/aws-cli.git (fetch)
origin  git@github.com:stephen-mw/aws-cli.git (push)
upstream    git@github.com:aws/aws-cli.git (fetch)
upstream    git@github.com:aws/aws-cli.git (push)
In the above example, I can push to either "upstream" (which in this case is the official aws repo), or "origin", which is my forked repo.
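If your clone doesn't have an "upstream" remote yet, wiring one up is two commands (reusing the repos from the example above):
$ git remote add upstream git@github.com:aws/aws-cli.git
$ git fetch upstream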

Creating a Git Tag

A tag is a point-in-time reference to a specific commit. Tags are immutable snapshots of the repo that can't be altered once they are created (though they can be deleted and recreated). They are useful for tasks such as deployments or ensuring your work won't be altered by others.
    # This will create a tag called "mysql-1380235429"
    $ git tag -a -m "Deploying mysql update" mysql-`date +%s`
    $ git push --tags
Now the tag can be checked out in the usual way.
    $ git checkout mysql-1380235429
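Since tags can be deleted and recreated, rolling one back is straightforward (reusing the tag name from above):
    # Delete the tag locally, then remove it from the remote
    $ git tag -d mysql-1380235429
    $ git push origin :refs/tags/mysql-1380235429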

Preparing Your Branch for a Pull Request

A pull request should be done before anything is merged into the master branch.
Before asking someone to review your work, you should take a few minutes to make sure it's really finished. You'd be surprised what can slip through.

Diff Against Master

The first thing you want to do is diff your branch against the master branch. This allows you to catch simple bugs before pushing anything upstream.
git diff master
This will tell you exactly what's different between your branch and the master branch. It will also highlight things such as trailing whitespace.
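If trailing whitespace is your main concern, git can flag it directly with the --check flag:
git diff --check master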

Rebasing your branch against master (or a different branch)

You can think of rebasing as doing the following actions:
  1. Take all of your recent changes and stash them away.
  2. Pull down the most recent changes from a different branch.
  3. Attempt to apply your stashed changes on top of the new changes, one at a time.
Here's how to rebase your branch against master:
git checkout master
git pull
git checkout my_feature_branch
git rebase master
git push origin +my_feature_branch
There are often conflicts in this process when the same file was changed in both branches. The way to fix a conflict is to edit the file during the rebase and add it again:
(fix the conflicts within the file)
git add some_conflicted_file
git rebase --continue
If you think there's a problem, you can abort the whole rebase with git rebase --abort.
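If you lose track of where you are mid-rebase, plain git will tell you:
git status             # lists the files that are still conflicted
git diff               # shows the remaining conflict markers
git rebase --abort     # bail out and restore the branch to its pre-rebase state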

Squashing Commits

It's important that your branch is tidy because it makes rolling back bad changes a lot easier. You can squash all of your commits into one or two commits using the rebase command.
First, run git log and find the commit that you want to squash into. Usually this is the first commit on your feature branch.
Next, interactively rebase against that commit:
git rebase --interactive shdf8032hfohsdofhsdohf80h^
This will rebase every commit after that sha (the trailing ^ points at the commit's parent, so the commit itself is included).
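If you'd rather not copy a sha out of git log by hand, git merge-base can find the commit where your branch forked from master, which points at the same place:
git rebase --interactive $(git merge-base master HEAD)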
Follow the instructions for rebasing. Usually you just want to change the "pick" to a "squash" or "s". In the below example, all commits will be squashed into the top commit (which is the oldest):
pick e775ebe Refactor regex to avoid duplication
s 3bb23fd Add support for spaces within unquoted values
s e36d71a Support '-' char in a key name
s 16eda81 Remove unused import in test module
s 403c7ec Add bugfixes to changelog
You will have a chance to rewrite the commit message to better include all of the changes.
After you're finished, diff against master again to make sure nothing went wrong. Then force-push your branch to origin.
# Make sure it still looks good
git diff master
git push origin +my_feature_branch

Send Out the PR

Go to github.com and find your branch. Compare it with master and send out the PR. You should ask someone specifically to review your changes and include the following information:
  • Why did you change this?
  • What have you changed?
  • How have you tested these changes?
  • What are the risks involved with these changes?

Oh no! I rebased my branch and lost a lot of history! I'm doomed!

Fear not. With git, nothing is ever truly lost. If you made a mistake and you're currently in the rebase process, you can abort it with git rebase --abort. If you've already committed, pushed, etc., then you can use the fantastic git reflog tool to go back to a different commit.

Using git reflog

In the following example, I'm going to force my branch back to the moment before I rebased and broke everything:
First, find the commit that came right before your rebase:
git reflog
5f56434 HEAD@{0}: pull upstream master: Fast-forward
ed610bc HEAD@{1}: checkout: moving from 1432-rb-force-safety-switch to develop
62d5af7 HEAD@{2}: rebase -i (finish): returning to refs/heads/1432-rb-force-safety-switch
62d5af7 HEAD@{3}: rebase -i (squash): Refuse rb if path contains keys.
781b4da HEAD@{4}: rebase -i (start): checkout 781b4da0dcc65736297464dd73da442daad4cf2c^
4350e25 HEAD@{5}: commit: Use two vars for readability. <---- ding ding ding
781b4da HEAD@{6}: checkout: moving from ed610bc9d38244feeaf0b640781da8ab01808f4e to 1432-rb-force-safety-switch
Next, we'll check out that sha and then force-push it:
git checkout 4350e25
git log # make sure it's what you want
git push origin +HEAD:some_branch
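If the detached-HEAD dance feels risky, an equivalent approach is to reset the branch in place using the same sha:
git checkout some_branch
git reset --hard 4350e25
git push origin +some_branch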

Saturday, December 12, 2015

Using ssh-import-id to manage authorized keys

ssh-import-id

While poking around in my ~/.ssh directory (in order to inspect and harden some of my SSH configurations -- more on that later), I noticed a file that I have never seen before:
ssh_import_id
I was surprised to see this file, especially in a directory related to openssh. Opening the file, I saw this:
{
 "_comment_": "This file is JSON syntax and will be loaded by ssh-import-id to obtain the URL string, which defaults to launchpad.net.  The following URL *must* be an https address with a valid, signed certificate!!!  %s is the variable that will be filled by the ssh-import-id utility.",
 "URL": "https://launchpad.net/~%s/+sshkeys"
}
ssh-import-id is a utility included with Ubuntu 14.04+ that, according to the man page, "will securely contact a public key server and retrieve one or more user's public keys". In other words, it's a way to manage your authorized_keys file via an external API.
You have two options: launchpad.net's user directory or GitHub. Running the utility will fetch and update the authorized_keys file based on the remote API.
For example, the following command will pull down my authorized keys on GitHub and update the file /home/stephen/.ssh/authorized_keys (since that's the user running the command):
stephen@cato:/etc/ssh$ ssh-import-id gh:stephen-mw
2015-11-30 21:44:04,813 INFO Already authorized ['4096', 'SHA256:3bLv3IXbSzhQpCnchqQprIRHXWPoI+PPW4xwguR6ktE', 'stephen-mw@github/10248951', '(RSA)']
2015-11-30 21:44:04,817 INFO Already authorized ['4096', 'SHA256:5ZtG8hD7l9+yU7I1S17FunmrPR5u6tEcRi0xa6wQGD4', 'stephen-mw@github/12837805', '(RSA)']
2015-11-30 21:44:04,817 INFO [2] SSH keys [Authorized]
The way it works is pretty simple. GitHub exposes an API for authorized keys. The utility simply makes a request to this endpoint and loads the output into the file. The utility is smart enough to know when keys change (that is, if you added all of your keys with ssh-import-id) and will keep things up-to-date.
By the way, did you know that GitHub has an API for retrieving any user's public keys? If that weirds you out, remember that they're called public keys for a reason! Here's Linus Torvalds' public key. It's a 2048-bit RSA key.
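You don't even need ssh-import-id to peek at them. GitHub serves any user's keys as plain text at a predictable URL (swap in any username):
$ curl https://github.com/stephen-mw.keys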
You can add something like this to your crontab to update your keys once a day at 4 am, and then once more after any restart. The @reboot entry ensures that servers and hosts that have been powered off for a long time can be accessed immediately once they come back up.
# Pull down my github keys and add them to my user
0 4 * * * ssh-import-id gh:stephen-mw
@reboot ssh-import-id gh:stephen-mw
I find this to be especially useful on small embedded computers, such as a Raspberry Pi. When the Raspberry Pi is started after a long period, it will automatically pick up my newest keys.

Security

My first problem was a file appearing magically in my ~/.ssh/ directory. I consider this directory a sacred place and don't like uninvited files in it. Apart from that, the application bills itself as "secure", so I took a look at the source. Mostly it looks fine, but there are some things I would like to see done differently:
  • Github usernames can change and that string is the only thing used to pull down the key. If you change your name you'll need to hunt down any instance of this program and update it. That's annoying with embedded systems, which is exactly the problem I'm hoping to solve with this application.
  • For SSL, the application uses Python's urllib and falls back to shelling out to wget. However, there's no guarantee that wget will honor https requests only. In fact, this can be disabled via ~/.wgetrc, so they're relying on wget's default behavior without being explicit (see the example after this list).
  • It checks only if the SSL cert is valid, but doesn't try very hard to see how valid it is. I would have preferred to see it reject any TLS versions lower than 1.2 and only accept EV certificates, since both domains use EV and TLS 1.2.
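To make the wget concern concrete: one line in a user's ~/.wgetrc disables certificate validation for every wget run, and anything that shells out to wget inherits it silently:
# ~/.wgetrc -- a single line that turns off TLS certificate validation for all wget calls
check_certificate = off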
The last issue worries me the most. Superfish, CNNIC, and eDellRoot all show that rogue certificate authorities are a real and not theoretical problem.
But like most things in the world, it's a trade-off. If you find the convenience outweighs the security risk -- and I do -- then give ssh-import-id a try.

Wednesday, November 18, 2015

List all keys in an S3 bucket using golang

Here's a golang snippet that will list all of the keys under your S3 bucket using the official aws-sdk-go. Note that a single ListObjects call returns at most 1000 keys, so the snippet pages through the results:
package main

import (
    "fmt"
    "log"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func main() {
    svc := s3.New(session.New(), &aws.Config{Region: aws.String("us-east-1")})

    params := &s3.ListObjectsInput{
        Bucket: aws.String("bucket"),
    }

    // A single ListObjects response is capped at 1000 keys, so page
    // through the results until every key has been printed.
    err := svc.ListObjectsPages(params, func(page *s3.ListObjectsOutput, lastPage bool) bool {
        for _, key := range page.Contents {
            fmt.Println(*key.Key)
        }
        return true // keep paging
    })
    if err != nil {
        log.Fatal(err)
    }
}
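Assuming you've saved the snippet as main.go (any name works) and your credentials live in the usual spots (environment variables or ~/.aws/credentials), running it is the standard Go workflow:
$ go get github.com/aws/aws-sdk-go/...
$ go run main.go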

Sunday, November 15, 2015

What's the difference between a PRNG and a CSPRNG?

PRNG vs CSPRNG

What's the difference between a pseudo-random number generator and a cryptographically-secure pseudo-random number generator? In short: a recommendation from the NIST.
The NIST ran various tests (defined in a publication known as SP800-22) to determine which PRNG algorithms produced the best output given certain criteria.
The results and recommendations from these tests can be read in another publication called SP800-90. Skip down to Appendix C (Informative) DRBG Mechanism Selection to see the good stuff.
Three algorithms are determined to be cryptographically-secure pseudo-random number generators:
* Hash_DRBG
* HMAC_DRBG
* CTR_DRBG
So there you go. Those three algorithms are CSPRNGs because they've been tested and recommended by the NIST*.
*There are other organizations and tests that determine whether a PRNG is a CSPRNG, but the NIST is the 800-pound gorilla.
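For a quick everyday intuition (my own illustration, not from the NIST publications): bash's $RANDOM is a plain PRNG and fine for harmless things, while /dev/urandom on Linux is fed by a CSPRNG and is what you should reach for when security matters:
# $RANDOM is a small, predictable PRNG -- fine for a jittered sleep in a cron job
sleep $(($RANDOM % 60))
# /dev/urandom is backed by the kernel's CSPRNG -- use it for keys, tokens, and salts
head -c 16 /dev/urandom | xxd -p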

Controversy

It's worth noting that the original publication actually had 4 CSPRNG recommendations. The now-infamous Dual_EC_DRBG was quietly removed in the revised publication, SP800-90Ar1.
There has been a lot of controversy[1][2] surrounding Dual_EC_DRBG. Whether or not any of the controversy is true, it's clear that trust in the NIST's authority has been undermined by its recommendation of Dual_EC_DRBG. I'm grateful that the NIST has removed the algorithm from its newest revision.
This also serves as a warning to anyone trying to influence these types of publications. The people that design, implement and critique this type of cryptography are smart. Very smart. Smarter than you and me -- and they don't like shenanigans!




Monday, July 20, 2015

Try patch instead of sed in shell scripts

A script is better seen than sed

I'd like to offer all of the shell scripters out there an alternative to sed replacements: patch.
Patch has several important benefits:
  • Context for all changes
  • Easy backup via the -b flag
  • Easy rollback
  • Easy multi-line replacements
Sed is great but can often lead to enigmatic behavior. This is especially true when using wildcard expressions or other shell magic. And as the saying goes, code is read more often than it is written, so you should write for readability.
Here's a common type of sed replacement you'll find in any old off-the-shelf shell script:
# Replace some mystery var with this one
sed -i 's/setting_foo: .*/setting_foo: 5/' file
This kind of thing makes sense to the author, but the reader is at a disadvantage. What is the original value of setting_foo? What type of setting is it and why change it? These are things that are lost in translation with these commands.
And it can easily and often times be worse:
sed -ie 's/[]\/$*.^|[]/\\&/g' file
Here's a real example of updating a mysql.conf. Since multiple lines potentially start with port, you need a regular expression that matches the word and whitespace to prevent accidents. It's also difficult to maintain the original justification of the file. And who knows what the original values were?
# Change the port. Since this argument appears on several lines, we only want to change the first one
sed -i '0,/^port\s*=/s/^port\s*=.*/port        = 8000/' /etc/mysql.conf

# Change a value by matching its exact whitespace
sed -i 's/max_allowed_packet             = 16M/max_allowed_packet             = 32M/' /etc/mysql.conf

# Allow more packets
sed -i 's/^max_allowed_packet\s*=.*/max_allowed_packet = 32M/' /etc/mysql.conf
Pretty ugly.
Now compare these same changes with patch:
patch -b /etc/mysql.conf <<PATCH
--- mysql.conf_old    2015-07-20 19:53:25.000000000 -0700
+++ mysql.conf        2015-07-20 19:58:28.000000000 -0700
@@ -8,14 +8,14 @@

 [client]

-port                           = 3306
+port                           = 8000
 socket                         = /var/run/mysqld/mysql.sock


 [mysql]

 no_auto_rehash
-max_allowed_packet             = 16M
+max_allowed_packet             = 32M
 prompt                         = '\u@\h [\d]> '
 default_character_set          = utf8                                # Possibly this setting is correct for most recent Linux systems
PATCH

Clean and simple! We've obviously added more lines to the script, but now we have context for our changes. We've also used the -b flag to automatically back up the file.
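Two small bonuses worth knowing (standard patch/diff behavior, nothing specific to this example): patch -b saves the original as mysql.conf.orig, so rollback is a one-liner, and diff -u will generate the hunk for you from a hand-edited copy:
# Generate a hunk by editing a copy of the file and diffing it against the backup
diff -u /etc/mysql.conf.orig /etc/mysql.conf

# Roll back by restoring the backup that patch -b created
mv /etc/mysql.conf.orig /etc/mysql.conf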

Closing thoughts

Of course this isn't always ideal. A good example is when you don't actually know the value that you're replacing. But nevertheless I think it's a good alternative that should be used when readability is important.

Saturday, February 28, 2015

List running EC2 instances with golang and the aws-sdk-go

Managing multiple AWS accounts can sometimes be tough, even when doing something as simple as matching a private IP address with a hostname.

In the past I used the aws cli tools, but I had to constantly switch both the accounts and the regions when making requests:

# Maybe it's in the prod-5 account in us-west-1
aws ec2 describe-instances --profile=prod-5 --region=us-west-1 | grep my_instance

# Okay, not that account or region, let's try eu-west-1
aws ec2 describe-instances --profile=prod-5 --region=eu-west-1 | grep my_instance

Repeat 1x for each account and region

As you can imagine, this is extremely time consuming, even with the CLI tools. I wrote a small tool that will find every instance you can view across every account available to you (according to your ~/.aws/config). The aggregate results can then be searched.

I chose to use the existing ~/.aws/config file so that it works alongside your normal aws cli tools.

This is a very good use-case for Golang, which has nice concurrency primitives. With five accounts this script (once compiled) will display all of the results in under 1.5 seconds. Not bad.

The result is a space-separated list with some additional values added. It should be easy to find what you're looking for:

$ ./aws_list
i-71930187 prod-manage005 10.0.0.180 t2.micro 207.171.166.22 prod-account
i-71930187 stage-controller7 10.0.0.164 t2.medium 72.21.206.80 staging-account
i-71930187 prod-vpn01 10.0.0.239 m4.large None prod-account
i-71930187 stephen-test-host 10.0.0.216 m3.large 216.58.216.174 test-account
...

The output is a space-separated file and only "running" and "pending" instances are displayed. For a list of filters or instance attributes consult the official documentation.


package main

import (
    "fmt"
    "github.com/awslabs/aws-sdk-go/aws"
    "github.com/awslabs/aws-sdk-go/service/ec2"
    "github.com/vaughan0/go-ini"
    "net/url"
    "os"
    "runtime"
    "strings"
    "sync"
)

func check(e error) {
    if e != nil {
        panic(e)
    }
}

/*
printIds accepts an aws credentials file and a region, and prints out all
instances within the region in a format that's acceptable to us. Currently that
format is like this:

  instance_id name private_ip instance_type public_ip account

Any values that aren't available (such as public ip) will be printed out as
"None"

Because the "name" parameter is user-defined, we'll run QueryEscape on it so that
our output stays as a space-separated line.
*/
func printIds(creds aws.CredentialsProvider, account string, region string, wg *sync.WaitGroup) {
    defer wg.Done()

    svc := ec2.New(&aws.Config{
        Credentials: creds,
        Region:      region,
    })

    // Here we create an input that will filter any instances that aren't either
    // of these two states. This is generally what we want
    params := &ec2.DescribeInstancesInput{
        Filters: []*ec2.Filter{
            &ec2.Filter{
                Name: aws.String("instance-state-name"),
                Values: []*string{
                    aws.String("running"),
                    aws.String("pending"),
                },
            },
        },
    }

    // TODO: Surface per-region API errors instead of skipping them. For now,
    // bail out of this region on error rather than dereferencing a nil
    // response below.
    resp, err := svc.DescribeInstances(params)
    if err != nil {
        return
    }

    // Loop through the instances. They don't always have a name-tag so set it
    // to None if we can't find anything.
    for idx := range resp.Reservations {
        for _, inst := range resp.Reservations[idx].Instances {

            // We need to see if the Name is one of the tags. It's not always
            // present and not required in Ec2.
            name := "None"
            for _, keys := range inst.Tags {
                if *keys.Key == "Name" {
                    name = url.QueryEscape(*keys.Value)
                }
            }

            important_vals := []*string{
                inst.InstanceID,
                &name,
                inst.PrivateIPAddress,
                inst.InstanceType,
                inst.PublicIPAddress,
                &account,
            }

            // Convert any nil value to a printable string in case it
            // doesn't exist, which is the case with certain values
            output_vals := []string{}
            for _, val := range important_vals {
                if val != nil {
                    output_vals = append(output_vals, *val)
                } else {
                    output_vals = append(output_vals, "None")
                }
            }
            // The values that we care about, in the order we want to print them
            fmt.Println(strings.Join(output_vals, " "))
        }
    }
}

func main() {
    // Go for it!
    runtime.GOMAXPROCS(runtime.NumCPU())

    // Make sure the config file exists
    config := os.Getenv("HOME") + "/.aws/config"
    if _, err := os.Stat(config); os.IsNotExist(err) {
        fmt.Println("No config file found at: %s", config)
        os.Exit(1)
    }

    var wg sync.WaitGroup

    file, err := ini.LoadFile(config)
    check(err)

    for key, values := range file {
        profile := strings.Fields(key)

        // Don't print the default or non-standard profiles
        if len(profile) != 2 {
            continue
        }

        // Where to find this host. The account isn't necessary for the creds
        // but it's something we expose to users when we print
        account := profile[1]
        key := values["aws_access_key_id"]
        pass := values["aws_secret_access_key"]
        creds := aws.Creds(key, pass, "")

        // Gather a list of all available AWS regions. Even though we're gathering
        // all regions, we still must use a region here for api calls.
        svc := ec2.New(&aws.Config{
            Credentials: creds,
            Region:      "us-west-1",
        })

        // Iterate over every single stinking region to get a list of available
        // ec2 instances
        regions, err := svc.DescribeRegions(&ec2.DescribeRegionsInput{})
        check(err)
        for _, region := range regions.Regions {
            wg.Add(1)
            go printIds(creds, account, *region.RegionName, &wg)
        }
    }

    // Allow the goroutines to finish printing
    wg.Wait()
}
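Assuming the file is saved as aws_list.go (the name is arbitrary), building and searching matches the ./aws_list usage shown above:
$ go get github.com/awslabs/aws-sdk-go/... github.com/vaughan0/go-ini
$ go build -o aws_list aws_list.go
$ ./aws_list | grep my_instance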

Wednesday, February 25, 2015

Generate a unique, strong password on the command line (linux, mac osx)

Find yourself generating a lot of random passwords? Here's a way to generate quick, random, and secure passwords on the command line:
echo $(head -c 64 /dev/urandom | base64) $(date +%s) | shasum | awk '{print $1}'
This command will read 64 bytes of random data from /dev/urandom, base64-encode it, add a small salt (the current date in epoch time), and then create a sha1 hash of the data.
I like this because it's cryptographically secure and the chance of a collision (provided your PRNG isn't totally borked) is vanishingly small. It's also a hexadecimal string, so I don't have to worry about quoting it in weird ways or escaping special characters. I can just double-click it in iTerm and it's automatically added to my clipboard!

Go ahead and double-click the shas below, then try double-clicking the password from 1Password at the very bottom. You'll know what I'm talking about.

The drawback being you can't possibly remember these passwords unless you're US memory champion Nelson Dellis, but you use a password manager anyway, right? Right?!
I do this so frequently that I created an alias, so I just have to type "pw" on the command line to get a random password.
alias pw="echo \$(head -c 64 /dev/urandom | base64) \$(date +%s) | shasum | awk '{print \$1}'"
Now you can create random passwords all day long.
[stephen ~]$ pw
fc2bff4a44cc71b77638185161383592adcf5a6d
[stephen ~]$ pw
172f09a28878eab53df26801564f164209da7b6e
[stephen ~]$ while true; do pw; done
cf8f04bfa23b16dea92b69a9af72a0433e67cb79
28219dc9f626233df6361b44c673505755ac380e
ce14392eeeb408d68a4436586fc05f691c334006
d9c82dd59637ee75d9090195a4633d4b184e6e65
26e6754f480cf039d6b0e131bf079b2a0338b3e2
75376f012bc2ff36c00cb224ac245da719c832ae
b530a231f3a60030db47c077a249857ce4bb2d45
...
# Here's a password from 1Password. Go ahead and double-click it to add it to your clipboard
Fp9ef>btgMUm%K2AokM(JXV,vkF?CGX9Ry4d78.a