Goodbye, LastPass

I’ve been using LastPass for 7 years or more and I have converted various businesses and people to it, but that ended today. I’m a security-minded person, so I’m not abandoning password managers, just LastPass.

A few days ago I woke up and my LastPass vault looked like this:

That looks bad… I contacted support, and it took them about a day to tell me: “Oh, yeah, your Vault is corrupted, don’t worry about it, we logged you out and fixed it. Just log back in.”

Pfiuuuu! I thought. That was a nerve-wracking, unproductive day, but at least LastPass support eventually showed up and fixed it…. NOPE!!!! It was still corrupt. I sent about 30 more emails to LastPass over a period of 4 days after that and got no response.

I did manage to revert my vault to the state it had under my previous master password. Thankfully I had changed it not that long ago, so I didn’t lose many passwords, and thankfully I still remember that old master password. But this is completely unacceptable behavior from LastPass toward a paying user, so that’s it, I’m done with LastPass.

Any recommendations on password managers?

How I started documenting at a new company

I once got a job at a company that was acquiring a massive piece of intellectual property from another, and it was my task to build a team to maintain and transition the knowledge as well as the running assets (servers, databases, etc.). The company that hired me had none of the relevant documentation, and the IP we were buying came with very little, all of it potentially obsolete. Everything was a matter of asking random people, one after the other, piecing together the puzzle until the answer was built. Oh… and I had 6 months to get it done.

The process of starting to write documentation was very daunting, so I created a new type of document that I named “Exploration”. An exploration is frozen in time and describes something that happened, was found, discovered, or figured out. For example, an exploration might say: “My boss asked me to add an account; I didn’t know we had accounts. I asked Johnny and he said Sally used to do it, but she left. I asked my boss who replaced Sally and was told to talk to Sam. Sam told me how he adds accounts; the steps are 1, 2, 3, 4. He doesn’t know what step 3 does, but he knows that if you don’t do it, the account doesn’t work.”


I bet I’m not the first one to come up with this concept, but I haven’t seen it anywhere. Have you?

The goal of explorations is to quickly form a written corpus of documentation about “How do we do things here?”. This allows you to depend less on specific people, which is very useful during transitions or turbulent periods when people might leave.

One of the key aspects of explorations is that they are narrative, informal, and not necessarily high quality. Writing the manual is hard, so people don’t do it, especially when it’s a big manual of which nothing is written yet. The exploration is a brain dump.

Explorations should be stored in a centralized documentation system that’s easily searchable. My preference is Confluence (maybe using the blogging feature); the cool kids are using Notion these days. Stay away from Google Docs, because it promotes private copies and there’s no way of having a single tree or directory of documentation. Being able to easily search all explorations is very important, and they should be searchable together with the rest of the documentation.

Explorations sometimes evolve into proper documentation: procedures that are maintained and live. When that happens, I often have a See Also section in the document that points to the explorations that influenced it, and at the top of each exploration I add a link to the document. It’s important that everybody knows explorations are tier-2 documentation, so that they are written liberally and not taken as absolute truth when read.

The frequency at which explorations are created changes depending on what’s going on at the company, but generally people should write one any time they face something puzzling. The way I do it is very simple: I constantly debrief with my team about what they did, and after they tell me the story of what happened, what they did, the workarounds, and we have a good laugh, I almost always say: “Please, write an exploration about it”. It takes some effort to get started, but people often realize that not having to remember things and just making brain dumps has a lot of value. Eventually they write explorations without me asking.

Don’t expect explorations to have an immediate effect. It takes time to build a corpus of data that is worth searching for answers. It might take you a year to get there.

Distributed companies vs time zones

Time zones are the arch-rival of distributed companies (or rather, the Earth being round is, but now I’m digressing into the meaning of time zones).

When you run a distributed company and you hire people, you might be tempted to hire from all over the world, but there’s a problem. If you hire someone who lives 12 time zones away from you, you’ll barely interact with that person, and suddenly the company moves at a day-long cadence instead of an immediate one.

The second problem is that people on the other side of the world will develop their own culture, their own style, an us-vs-them mentality in which you, the company, are “them”. All unwanted traits.

My advice, when getting started, is not to hire anyone more than 4 time zones away, so you end up with at least half a day of overlapping work hours. Only after you have very trusted and senior managers 4 time zones away can you hire people 8 time zones away who have a 4-hour overlap with those managers, and repeat until you have all time zones covered by trusted senior managers who can carry the culture of your company.
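
The overlap arithmetic above can be sketched in a few lines of Python (my own illustration, not from the original post; it ignores wrap-around across the date line):

```python
def overlap_hours(offset_a, offset_b, workday=8):
    """Hours of a standard 8-hour workday shared by two offices.

    Each hour of time-zone distance removes one hour from the overlap.
    """
    distance = abs(offset_a - offset_b)
    return max(0, workday - distance)

print(overlap_hours(-5, -1))  # 4 zones apart: half a day of overlap
print(overlap_hours(0, 8))    # 8 zones apart: zero overlap without shifted hours
```

This is why the 8-zone hire only works through a manager sitting at the 4-zone midpoint: each hop keeps a 4-hour shared window.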

This means that you are unlikely to go fully global until you are at least 50 people, and even then that’s pushing it. I’d be uncomfortable with a fully global company smaller than 200 people.

How I’m testing seed data generation

When I create a new Rails project, I like to have robust seed data that can quickly bootstrap the development, testing, and staging environments so you can interact with the application. I think this is critical for development speed.

If a developer creates a feature to, for example, connect two records together, you want them to just fire up the application and connect two records to see it work. You don’t want them spending time creating the records, because that’s a waste of time, but also because all developers end up with different subsets of testing data and generally ignore everything that’s not included in theirs. It’s better to grow a single set of testing data that’s good and complete.

One of the problems I run into is that generating testing data, or sample data, doesn’t happen often enough, and changes to the application often break it. Because of that, I wrote this simple test:

RSpec.describe "rake db:seed" do
  it "runs" do
    # Load the app's rake tasks, but only once: loading them twice would
    # register every task's actions twice.
    Imok::Application.load_tasks if Rake::Task.tasks.none? { |t| t.name == "db:seed" }
    ENV["verbose"] = "false"
    Rake::Task["db:seed"].invoke
  end
end

It doesn’t have any assertions, but with just a few lines of code it probably covers 90% of seed-data creation problems, which generally result in a crash. I would advise against having assertions here, as they may cost more time than they save: sample data evolves a lot and isn’t production critical.

The startup CTO dilemma

About 10 years ago I took my first job as a CTO, but I wasn’t a CTO; I just had the title. I was a developer with ambition. I made mistakes, very expensive mistakes, mistakes that contributed to the failure of the startup. Since then I have learned and grown a lot, and although there’s still a lot for me to learn, there are some things I understand reasonably well. One of those is how to be the CTO of an early-stage (and not-so-early-stage) startup.

With this experience, though, my salary went up: I’m more expensive now than I was 10 years ago, when I didn’t know what I was doing. Because of this, I tend to evaluate working for a startup not on day 1 but on day 700 or later, when they have some traction, revenue, etc. The problem is that by then a lot of those startups are deep in problems that are very hard or impossible to fix. It’s very painful for me to see expenses of hundreds of thousands of dollars because someone didn’t do 30 minutes of work 5 years ago (this is a real example).

So, the dilemma is this:

  • If a startup hires an experienced CTO from day 1, they are wasting money, because the CTO might spend only 5% or 10% of their time CTOing and the rest coding, doing IT, etc., which could be done by a less experienced developer.
  • If a startup doesn’t hire an experienced CTO from day 1, they are likely to make very expensive mistakes that may literally kill the startup in year 3, or at least slow it down a lot.

I’ve been thinking about this for a while: how can it be solved?

One of my ideas was to be a sort of CTO enhancer: the voice of experience for a less experienced co-founding CTO, helping them a few hours a week for a few months, up to a couple of years. What do you think? Does this sound valuable? Useful?

I might be thinking a lot about this lately since I’m leaving my current job and searching for the next thing to do.

Nicer printing of Rails models

I like my models to print nicely, making the class of the model as well as the id and other data visible, so that when they end up in a log or a console, I know exactly what they are. I’ve been doing this since before Rails 3, and since Rails projects now have an ApplicationRecord class, it’s even easier.

On my global parent for all model classes, ApplicationRecord, I add this:

def to_s(extra = nil)
  extra = ":#{extra}" if extra
  "#<#{self.class.name}:#{id}#{extra}>"
end

That makes all records automatically print as:

#<ModelName:id>

For example:

#<User:123>

which makes the record findable.

But it also allows subclasses to add a bit of extra information without having to re-specify the whole format. For example, for the User class, it might be:

def to_s
  super(email)
end

so that when the user gets printed, it ends up being:

#<User:123:sam@example.com>

which I’ve found helps a lot with quickly debugging issues, in production as well as in development environments.
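
To see the format outside of Rails, here’s a self-contained sketch in which a plain Ruby class stands in for ApplicationRecord (Record here is my stand-in, not a real Rails class):

```ruby
# Stand-in for ApplicationRecord: any class with an id can use the same format.
class Record
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def to_s(extra = nil)
    extra = ":#{extra}" if extra
    "#<#{self.class.name}:#{id}#{extra}>"
  end
end

# A subclass appends its own detail without re-specifying the whole format.
class User < Record
  attr_reader :email

  def initialize(id, email)
    super(id)
    @email = email
  end

  def to_s
    super(email)
  end
end

puts Record.new(7)                     # #<Record:7>
puts User.new(123, "sam@example.com")  # #<User:123:sam@example.com>
```

Since puts calls to_s, the compact format shows up anywhere the object is printed.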

Editing Rails 6.0 credentials on Windows

Rails 6 shipped with a very nice feature to keep encrypted credentials in the repo but separate them by environment, so you can have the credentials for development, staging, and production encrypted with different keys that you keep safe at different levels.

For example, you might give the development key to all developers, but the production keys are kept very secret and only accessible to a small set of devops people.

You edit these credentials by running:

bundle exec rails credentials:edit --environment development

for the development credentials, or

bundle exec rails credentials:edit --environment production

for production ones. You get the idea.

When you run it, if the credentials don’t exist, it generates a key. If they do exist, you need to have the keys. After decrypting, it runs your default editor, and on Windows this is the error I was getting:

No $EDITOR to open file in. Assign one like this:

EDITOR="mate --wait" bin/rails credentials:edit

For editors that fork and exit immediately, it's important to pass a wait flag,
otherwise the credentials will be saved immediately with no chance to edit.

It took me a surprisingly long time to figure out how to set the editor on Windows, so, for myself and others, I’m documenting it in this post:

$env:EDITOR="notepad"

After that, running the credentials:edit command works and opens Notepad. Not the best editor by far, but for these quick changes it works. Oh, and I’m using PowerShell; I haven’t run cmd in ages.
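
One caveat worth noting: `$env:EDITOR` only lives for the current PowerShell session. If you want the setting to survive closing the terminal, the standard .NET environment API (a general PowerShell facility, nothing Rails-specific) can store it per user:

```powershell
# Persist EDITOR for your user account; new shells will pick it up.
[Environment]::SetEnvironmentVariable("EDITOR", "notepad", "User")
```

The session-level `$env:EDITOR` assignment is still needed if you want it to take effect in the shell you already have open.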

Converting Python data into a reStructuredText table

This probably exists, but I couldn’t find it. I wanted to export a bunch of data from a Python/Django application into something a non-coder could understand. The data was not going to be a plain CSV but a document, with various tables and explanations of what each table is. Because reStructuredText seems to be the winning format in the Python world, I decided to go with that.

Generating the text part was easy and straightforward. The question was how to export tables. I decided to represent tables as lists of dicts, and thus I ended up building this little module:

from collections import defaultdict
from io import StringIO


def dict_to_rst_table(data):
    field_names, column_widths = _get_fields(data)
    with StringIO() as output:
        output.write(_generate_header(field_names, column_widths))
        for row in data:
            output.write(_generate_row(row, field_names, column_widths))
        return output.getvalue()


def _generate_header(field_names, column_widths):
    with StringIO() as output:
        for field_name in field_names:
            output.write(f"+-{'-' * column_widths[field_name]}-")
        output.write("+\n")
        for field_name in field_names:
            output.write(
                f"| {field_name} {' ' * (column_widths[field_name] - len(field_name))}"
            )
        output.write("|\n")
        for field_name in field_names:
            output.write(f"+={'=' * column_widths[field_name]}=")
        output.write("+\n")
        return output.getvalue()


def _generate_row(row, field_names, column_widths):
    with StringIO() as output:
        for field_name in field_names:
            output.write(
                f"| {row[field_name]}{' ' * (column_widths[field_name] - len(str(row[field_name])))} "
            )
        output.write("|\n")
        for field_name in field_names:
            output.write(f"+-{'-' * column_widths[field_name]}-")
        output.write("+\n")
        return output.getvalue()


def _get_fields(data):
    field_names = []
    column_widths = defaultdict(lambda: 0)
    for row in data:
        for field_name in row:
            if field_name not in field_names:
                field_names.append(field_name)
            column_widths[field_name] = max(
                column_widths[field_name], len(field_name), len(str(row[field_name]))
            )
    return field_names, column_widths

It’s straightforward and simple. It currently cannot deal very well with cases in which the dicts have different sets of columns.
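
If anyone needs to work around that limitation, one approach (my own sketch, not part of the module above) is to normalize the rows first so every dict carries every column:

```python
def normalize_rows(data, missing=""):
    """Give every row every column, filling absent cells with `missing`."""
    field_names = []
    for row in data:  # collect columns in first-seen order
        for field_name in row:
            if field_name not in field_names:
                field_names.append(field_name)
    return [{name: row.get(name, missing) for name in field_names} for row in data]

rows = [{"name": "Ada"}, {"language": "Python"}]
print(normalize_rows(rows))
# [{'name': 'Ada', 'language': ''}, {'name': '', 'language': 'Python'}]
```

The normalized list can then be passed to dict_to_rst_table as-is.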

Should this be turned into a reusable library?

WordPress.com new editor handles titles right

I like my text properly formatted, and with pretty much every CMS editor out there I run into some confusion when it comes to titles. The post or page has a title, and then the sections inside it also have titles, which are second-level titles. In most CMSs you have the option of headings from level 1 through 6 (mapping to h1 through h6 in HTML).

For example, this is Confluence, a tool that I really like:

The confusion I get here is whether a subsection of this page should use Heading 1 or Heading 2. Sometimes Heading 1 is displayed with the same font, size, etc. as the title of the page, so by using Heading 1 you are almost creating two pages in one. But in other CMSs, Heading 1 is the top-level section heading, and the title of the page is a special title that always sits above it.

The new WordPress.com editor does this correctly by making it very clear that your first option for section headings, after setting the title of the page or post, is h2:

I think that was a very neat solution to the problem. Bravo WordPress.

Turning a list of dicts into a reStructuredText table

I recently found myself having to prepare a report of some mortgage calculations so that non-technical domain experts could read it, evaluate it, and tell me whether my math and the way I was using certain APIs was correct.

Since I’m using Python, I decided to go as native as possible and make my little script generate a reStructuredText file that I could then convert into HTML, PDFs, whatever. The result of certain calculations ended up looking like a data table expressed as a list of dicts, all with the same keys. I wrote a function that turns such a list of dicts into an appropriately formatted reStructuredText table.

For example, given this data:

creators = [{"name": "Guido van Rossum", "language": "Python"}, 
            {"name": "Alan Kay", "language": "Smalltalk"},
            {"name": "John McCarthy", "language": "Lisp"}]

when you call it with:

dict_to_rst_table(creators)

it produces:

+------------------+-----------+
| name             | language  |
+==================+===========+
| Guido van Rossum | Python    |
+------------------+-----------+
| Alan Kay         | Smalltalk |
+------------------+-----------+
| John McCarthy    | Lisp      |
+------------------+-----------+

The full code for this is:

from collections import defaultdict
from io import StringIO


def dict_to_rst_table(data):
    field_names, column_widths = _get_fields(data)
    with StringIO() as output:
        output.write(_generate_header(field_names, column_widths))
        for row in data:
            output.write(_generate_row(row, field_names, column_widths))
        return output.getvalue()


def _generate_header(field_names, column_widths):
    with StringIO() as output:
        for field_name in field_names:
            output.write(f"+-{'-' * column_widths[field_name]}-")
        output.write("+\n")
        for field_name in field_names:
            output.write(f"| {field_name} {' ' * (column_widths[field_name] - len(field_name))}")
        output.write("|\n")
        for field_name in field_names:
            output.write(f"+={'=' * column_widths[field_name]}=")
        output.write("+\n")
        return output.getvalue()


def _generate_row(row, field_names, column_widths):
    with StringIO() as output:
        for field_name in field_names:
            output.write(f"| {row[field_name]}{' ' * (column_widths[field_name] - len(str(row[field_name])))} ")
        output.write("|\n")
        for field_name in field_names:
            output.write(f"+-{'-' * column_widths[field_name]}-")
        output.write("+\n")
        return output.getvalue()


def _get_fields(data):
    field_names = []
    column_widths = defaultdict(lambda: 0)
    for row in data:
        for field_name in row:
            if field_name not in field_names:
                field_names.append(field_name)
            column_widths[field_name] = max(column_widths[field_name], len(field_name), len(str(row[field_name])))
    return field_names, column_widths

Feel free to use it as you see fit, and if you’d like this to be a nicely tested, reusable pip package, let me know and I’ll turn it into one. One thing I would need to add is making it more robust to malformed data and handling more shapes of input.

If I turn it into a pip package, it would be released from Eligible, as I wrote this code while working there, and we are happy to contribute to open source.