Nerd Alert Issue 18: It’s GitHub’s Birthday!

GitHub turns 7 years old today!


HOT LINKS

What we're reading this week

  Adam: I don’t agree with everything Eugene Wei has to say about why tech companies are largely beating publishers at their own game, but he makes so many good points that his take on the recent rumblings about Facebook hosting content directly for publishers is worth reading carefully and digesting.

  Ben: Lego launched a new video game this week, and forgot to buy its domain name. A journalist bought the domain, and now it’s an ethical discussion. What do you think of the journalist’s actions, and the community reactions?

  Denise: Cindy Royal, a Texas State University professor and recent Knight Journalism Fellow at Stanford University, makes a case for coding to be taught across disciplines at colleges. Coding education, she says, will help support the future of journalism (and so many other fields).

  Kaeti: As a distributed team, we do a lot of virtual client meetings. Erika Hall’s article on presenting design remotely is so relevant to my interests. Her core advice can apply to any presentation or meeting: get to know your stakeholders, stay in constant communication, and maintain clear expectations.

  Meredith: My background in literacy attracts me to the idea of programming languages and toolkits for the mind. Language, vocabulary and grammar impact your perspective and your strategies to solve problems.

  Nick: Genius.com, the people behind my favorite hip-hop lyrics site rap.genius.com, are now taking their mastery of annotations and applying it to all kinds of web content. John Resig (the creator of jQuery) annotates the original jQuery code from 2006, and here are notes on Obama’s State of the Union Address. It’s still in beta, but I hope to see this or a spin-off as an open platform for commentary on the web. Annotate all the things!

  Ryan: The INN Nerds have been looking at ways to simplify and streamline documentation of our visual styles in a way that will allow us to get new projects off the ground more quickly in the future. See this post about living style guide tools for a thorough list of tools meant to help lessen the effort required to document your favorite flavor of CSS and keep said documentation up-to-date.

  Will: PDFs are great, except when you scrape data out of them. Jeremy Merrill shares his solution to this problem, using the command-line version of Tabula to build reusable scrapers for automatically generated PDFs. Use them as a model, and liberate your data from PDFs too.

  Bert: One of my cousins is going to space on Monday!


WE MADE A THING

Our projects, manifest

A screenshot of the new Current.org homepage, featuring the navigation and two celebrity photos in the featured image spot.

Current.org launched their redesigned site this week! We're happy to see another site using our work.


SHOUT OUT

Work we admire by our journalism peers

To all our friends who participated in #SNDMakes, you’re awesome! We look forward to seeing what you come up with for improving picture stories today.


SOME OTHER STUFF

Gather ye rosebuds

LISTEN: The tick, tick, tick of doom.

COOK: What if you don’t like thick, spongy pancakes? Try this Norwegian recipe.

WATCH: This month's News Nerd Book Club book was Indi Young's Practical Empathy. We were pleasantly surprised when she visited the book club hangout to talk with us. Here's a video of her talk about practical empathy at UX Lausanne 2014.

GIF: From member site Wisconsin Watch’s fracking sand gif collection comes this masterpiece:

A line art animation of a bull, with horns made of sand. The bull's horns flex up and down, rearranging the sand. When the horns are pointed up, the bull snorts, and grains of sand fly out of its nostrils.

Nerd Alert Issue 17: Gooseneck Barnacle Cathedral

Barnacle geese were once thought to develop from goose barnacles.


HOT LINKS

What we're reading this week

  Adam: Communicating securely with colleagues and sources has received a lot of attention in the past couple of years, but it’s often still a bit intimidating for non-technical folks to know where to start. This week WNYC’s John Keefe wrote a step-by-step guide to setting up email encryption that provides a gentle introduction to this sort of thing.

  Ben: Most of our readers have probably collided with Git at some point in the past, or had to explain it to someone. The Git Parable, by one of the cofounders of GitHub, is a pretty awesome intro to the version control software that rules our lives today.

  Denise: Time for some inspiration: The annual IRE Award winners were announced this morning. Congrats to the winners (including some INN members)!

  Kaeti: I really enjoyed this Made By interview with Chris Coyier (of CSS-Tricks, CodePen and the Shop Talk podcast). A good reminder that expertise is born of lots and lots of practice, not magic. Plus I’m always happy to see more tech people choosing to work in the midwest.

  Meredith: I prefer Twitter to other social platforms. I am fascinated by the ways that different people use it professionally and personally. Sarah Marshall offers 15 tips for using Twitter for news gathering.

  Nick: In Nerd Alert #14, Kaeti shared an article by Sailor Mercury on “Code Like A Girl”. I almost re-shared it here because I thought it was so awesome, but Kaeti saved me from the dreaded re-post. Instead, you should take a look at Sailor Mercury’s Kickstarter to make a series of zines on computer science called Bubble Sort. I used to collect the zine 2600, but this is going to be so much better.

  Ryan: If you're new to the command line, it can be daunting, but stick with it. Learn to love it. Powerful tools always have a steep learning curve, but the payoff is usually worth your trouble. Olivier Lacan's post covers the Unix $PATH environment variable, which plays a central role in determining the behavior of the command line. It's a great primer for those that are new to the command line and want to understand the basics, but also has some helpful tips for well-seasoned users.

  Will: With the Apple Watch due in stores later in April, read how Apple secretly built the thing, without any real idea of what they intended to create.

  Bert: Which comes first, the README or the project?


WE MADE A THING

Our projects, manifest

We now offer data consulting, helping you clean data, find stories, and build custom news apps. Last November we launched Power Players, and we look forward to creating something new!


SHOUT OUT

Work we admire by our journalism peers

SRCCON's list of session proposals is automatically updated by a chatbot. Here's the code behind BMO's ability to clean and sort the data, and Erin Kissane explains how it works in this Source post.


SOME OTHER STUFF

Gather ye rosebuds

LISTEN: Uptown Funk is topping the charts. Here’s Voldemort’s hot take.

COOK: Prepare and eat some Spanish gooseneck shrimp.

WATCH: Romance is in the air!

GIF: The Internet - how does it work?

A gif showing the interconnection between a home computer, an ISP, a central router, a government computer, a satellite ground station, and an undersea cable. The illustration is done in David Macaulay's distinctive style, as seen in books such as Castle, Underground, and City.

Join Us For The February News Nerd Book Club Hangout

In the Beginning Was the Command Line

The votes have been counted and re-counted, and now we can say with certainty that Neal Stephenson's In the Beginning... Was the Command Line is the people's choice for the February News Nerd Book Club reading.

This essay was first published online in 1999, and later published as a book in November of that year. You can download a zipped text file from Stephenson's site or buy it wherever older books are sold.

In the Beginning... is about the relationships between users and their computers, and how those relationships have changed over time. It's a 1999 opinion on the state of technology in 1999, leading Neal Stephenson to write just four years later:

In the Beginning was the Command Line is now badly obsolete and probably needs a thorough revision. For the last couple of years I have been a Mac OS X user almost exclusively.
– Neal Stephenson, in a 2003 post on his website and a 2004 Slashdot interview

With that glowing recommendation, we'll let you get started reading. The February meeting will be held via Google Hangout on Wednesday, February 11, at 1 p.m. Eastern time.

The invite and link to RSVP are here (and here is a direct link to the Google Hangout for quick reference).

The March book club meeting will be held in person at NICAR. If you'd like to suggest books for that and future meetings, you can do that over on our reading list hackpad.

Happy reading!

January Book Club Recap

Ida Tarbell, as shown in the frontispiece of All in the Day's Work

For January the News Nerd Book Club read Ida Tarbell's autobiography, All in the Day's Work.

Tarbell details her growth from a child in the oil-rich lands of 1850s western Pennsylvania to the muckraker known for her investigations of Standard Oil. Her words take us to Poland Seminary of Poland, Ohio; The Chautauquan in Chautauqua, New York; and the streets of 1890s Paris, where she researched Madame Roland and wrote for McClure's Magazine. She wrote a series on Bonaparte, then one on Abraham Lincoln, then another on Standard Oil and John D. Rockefeller.

Some of what we discussed:

  • Is true objectivity possible in journalism? Tarbell grew up affected by the oil industry, and targeted its illegal practices, but didn't align herself with the muckrakers of the era and tried to find a balance between different sides of the story.
  • The French citizens Tarbell encountered weren't concerned with life outside the borders of France. Are there modern parallels?
  • Expatriate writers were in such demand that Tarbell funded her time in Paris with articles for American publishers. Are Americans today actually interested in other countries' events beyond just the story of the moment?
  • Her dedication to her work amazed us. She was truant in grade school until she discovered that schoolwork was a puzzle to be solved.
  • Are all journalists driven by an intense curiosity from childhood? Tarbell's story is but one example. If you interviewed every investigative journalist, would you find that the topics of their investigations were related to their childhood experiences?
  • Tarbell's greatest stories were serialized over months or years in magazines. Is Serial a sign that this format of publishing will return?

Next month!

Our book club hangout next month will be Wednesday, February 11, at 1 p.m. Eastern time.

Help us select the book for February's hangout by filling out this quick survey.

The three titles under consideration are:

The Design of Everyday Things by Donald Norman

The Design of Everyday Things is even more relevant today than it was when first published.
– Tim Brown, CEO, IDEO

Responsive Web Design, second edition by Ethan Marcotte

Day by day, the number of devices, platforms, and browsers that need to work with your site grows. Ethan’s straightforward approach to designing for this complexity represents a fundamental shift in how we’ll build websites for the decade to come.
– Jeffrey Veen, CEO and cofounder of Typekit, VP of Products at Adobe

In the Beginning... Was the Command Line by Neal Stephenson

In the Beginning was the Command Line is now badly obsolete and probably needs a thorough revision. For the last couple of years I have been a Mac OS X user almost exclusively.
– Neal Stephenson, in a 2003 post on his website and a 2004 Slashdot interview

You can also add suggestions to our book club reading list or tweet them to us @newsnerdbooks.


Ida Tarbell on pie:

Of the Monthly I have more distinct recollections. It was in these that I first began to read freely. Many a private picnic did I have with the Monthly under the thorn bushes on the hillside above Oil Creek, a lunch basket at my side. There are still in the family storeroom copies of Harper's Monthly stained with lemon pie dropped when I was too deep into a story to be careful.

Party Time In A Widget

The Maine Center for Public Interest Reporting launched its Political Party Time series this week. It's powered in part by the Sunlight Foundation's Political Party Time database, which I'll call the PPT.

PPT is a giant database of political fundraisers, filled with information from invitations sent to Sunlight by readers like you. There are a lot of ways to sort the information.

The Pine Tree Watchdog Political Party Time series queries the PPT API (what's an API?) to get a list of events in Maine. Sunlight's API has a number of ways to limit the events returned (by date, host, beneficiary, etc.), but the sidebar widget we wrote for the series' pages simply lists events that benefit Maine politicians.

The widget lists the benefiting politician or politicians, the date, the place, and a name for each of the 50 most recent events. If a politician has been talked about on the Pine Tree site, readers can click to see stories about that politician. If there's a story about that event, it will be listed in the results. They can also click on the event name to go to the PPT site to see more about the event, including the invitation.

The code for the widget is currently contained in the WordPress child theme for the Pine Tree Watchdog site (they use INN's Largo platform, which provides the parent theme). The parts relevant to this post are the widget itself and the custom page template for the PPT series page.

Here's how it works, and what I learned while building it.

How do you query an API?

The simple way to query an API is to find the right URL and hit it with a browser. It'll return some JSON code that looks like this:

[Screenshot: the raw JSON returned by the API]

But our readers prefer pretty tables. We do, too, which is why we parse that JSON to get information about the event. Slap some JavaScript into the browser window, load the page and go.

Except it's not that simple. We can't hit the API from the browser because, in addition to requiring an API key to limit access, the Sunlight Foundation APIs deny CORS requests.

CORS is a response to a response to a security threat. To keep a malicious website from using your logged-in credentials to read your data on another site, browser makers implemented the same-origin policy. Now code on malicious sites can't read information from good web pages, but neither can good web pages read information from other good web pages.

In situations where both sites want to share information, browser and server developers agreed on the CORS mechanism. When code on a site like Pine Tree Watchdog requests information from the Sunlight Foundation PPT, the browser adds an Origin HTTP header saying that the request originated on the Pine Tree Watchdog site. The Sunlight Foundation API response could include a corresponding Access-Control-Allow-Origin header telling the browser that the Pine Tree Watchdog site is allowed to access the data.

But the response doesn't. Sunlight doesn't have unlimited bandwidth and server space to handle a good, thorough slashdotting. Denying CORS requests cuts down on the amount of information they have to serve.
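The header exchange described above can be sketched roughly in Python. This models only the check a browser makes on the response. Origin and Access-Control-Allow-Origin are the real header names, but the function name and the example origin are invented for illustration, and real browsers also handle credentials, preflight requests and other cases that are left out here.

```python
def browser_allows_read(request_origin, response_headers):
    """Return True if a CORS-aware browser would let page code read the response.

    Simplified model: only the Access-Control-Allow-Origin header is considered.
    """
    # Header names are case-insensitive, so normalize them before the lookup.
    headers = {name.lower(): value for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    if allowed is None:
        # No CORS header: the same-origin policy applies and the read is blocked.
        return False
    # "*" opens the resource to every origin; otherwise the value must match exactly.
    return allowed == "*" or allowed == request_origin


# Hypothetical origin for the requesting page.
page_origin = "https://pinetreewatchdog.example"

# A response without the header, like the one the post describes, is blocked.
print(browser_allows_read(page_origin, {"Content-Type": "application/json"}))
# A response that names the requesting origin would be readable.
print(browser_allows_read(page_origin, {"Access-Control-Allow-Origin": page_origin}))
```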

Since the reader's browser can't retrieve this information, the PTW site does it for them. If the PPT API hasn't been queried in the last 30 minutes, the server will download the API response and save it to disk. Then the site parses that into an HTML table, and saves the table to disk. The HTML table gets placed inside the widget, and the widget is displayed to readers with some CSS that makes the table look like a list.

All this caching happens in order to keep things snappy for readers. By caching the PPT API response, the server doesn't have to hit the API every time it wants to rebuild the list of events. And by caching the event listing HTML, it can insert that in the page instead of laboriously building the listing every single time.
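The widget itself is written in PHP, so the following is not its code. It is a minimal Python sketch of the same time-based file cache, assuming a 30-minute window as described above. The function names, the cache file name and the fake fetch function are all made up for illustration.

```python
import json
import os
import tempfile
import time

CACHE_SECONDS = 30 * 60  # the 30-minute window mentioned in the post


def is_fresh(path, max_age=CACHE_SECONDS, now=None):
    """True if the file exists and was modified less than max_age seconds ago."""
    if not os.path.exists(path):
        return False
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) < max_age


def cached_events(cache_path, fetch, max_age=CACHE_SECONDS):
    """Return the event data, calling fetch() only when the cache is stale.

    fetch is any zero-argument callable returning JSON-serializable data;
    on a real site it would wrap the HTTP request to the API.
    """
    if is_fresh(cache_path, max_age):
        with open(cache_path, encoding="utf-8") as handle:
            return json.load(handle)
    data = fetch()
    # Write to a temporary file and rename it, so readers never see a partial cache.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    os.replace(tmp_path, cache_path)
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "events.json")
        fake_fetch = lambda: [{"name": "Example reception"}]  # stands in for the API call
        print(cached_events(path, fake_fetch))  # fetched, then written to disk
        print(cached_events(path, fake_fetch))  # served from the cached file
```

The same freshness check could guard the rendered HTML table, so the page only rebuilds the listing when the underlying data has been refreshed.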

Cross-referencing

Linking the politicians' names is a pretty cool feature. The widget checks to see if a tag exists in the WordPress database that matches the politician's name. If it does, then their name in the listing is made a link to all stories tagged with the politician.

In a development build, I had it automatically create new tags if the tag didn't exist. This was a little too much information for the readers, so it was removed before the widget launched. The pre-creation of new tags made things easier for the writers, but it gave readers a link to nowhere.
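The link-only-if-tagged rule can be sketched in a few lines of Python. This is an illustration, not the widget's PHP: the set of tags stands in for the WordPress lookup, and the slug format, the /tag/ URL prefix and the names are assumptions.

```python
from html import escape
from urllib.parse import quote


def slugify(name):
    """Lowercase the name and join its words with dashes (a simplified tag slug)."""
    return "-".join(name.lower().split())


def render_name(name, existing_tags, base_url="/tag/"):
    """Link the politician's name only when a matching tag already exists."""
    safe_name = escape(name)
    if name.lower() in existing_tags:
        return '<a href="{}{}/">{}</a>'.format(base_url, quote(slugify(name)), safe_name)
    # No tag means no stories yet, so plain text avoids a link to nowhere.
    return safe_name


# Hypothetical tag names, lowercased for comparison.
tags = {"jane doe"}
print(render_name("Jane Doe", tags))  # <a href="/tag/jane-doe/">Jane Doe</a>
print(render_name("John Roe", tags))  # John Roe
```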

What I learned:

  • CORS feels generally useless
  • file i/o in PHP (for caching)
  • td { display:block; }
  • some ways of handling JSON in PHP
  • WordPress widget construction
  • think about the readers, not just the newsroom users

This was the last project I worked on during my summer internship with INN. Now I'm in Idaho on a 3-month emedia internship for the Progressive Publishing family of magazines.

A Simple Tool To Create Responsive Tables Directly In Your Browser

Note: This tool is not maintained, and has been deprecated since February 2018. It does not handle Google's API limits. 

This note was added December 4, 2018.


We released a responsive table creation command-line tool last month. After installation, it took a Google spreadsheet and some configuration in a JSON file, and returned a collection of files to be uploaded to your webserver and embedded on your page. Sounds complicated, right?

Today we're launching a responsive table creation webapp. Feed it the URL of a Google spreadsheet and it will help you create a collection of files that need to be uploaded to your webserver and embedded on your page.

The difference between today's responsive table tool and last month's is that this one runs in your browser. It offers you a convenient user-friendly interface. It lets you preview what your table will look like. And, most importantly, the tool doesn't require you to install anything on your computer for it to work.

Give it a try!

How the webapp works

Once you've set up a spreadsheet in Google Drive, paste its URL into the URL field of the webapp. The page parses the URL to extract the spreadsheet key (44 characters of upper- and lower-case letters, numbers, and dashes), then feeds the key to Tabletop.js, which parses the spreadsheet. Tabletop.js and jQuery combine to render a list of columns in the webapp, allowing users to give each column a display name. Users can also add their Google Analytics ID to be included in the download.
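The webapp does its URL parsing in JavaScript. The Python below is only a sketch of the key-extraction step, under a few assumptions: the key sits either after /d/ in the path or in a key query parameter, it is 44 characters long as described above, and it may also contain underscores. The function name and the sample key are made up.

```python
import re
from urllib.parse import urlparse, parse_qs

# 44 characters, per the post; allowing the underscore is an extra assumption.
KEY_PATTERN = re.compile(r"[A-Za-z0-9_-]{44}")


def extract_key(url):
    """Return the spreadsheet key found in a Google Sheets URL, or None."""
    parsed = urlparse(url)
    # Path-style URLs carry the key as the segment after /d/.
    segments = parsed.path.split("/")
    if "d" in segments:
        index = segments.index("d") + 1
        if index < len(segments) and KEY_PATTERN.fullmatch(segments[index]):
            return segments[index]
    # Query-style URLs pass it as a key parameter.
    for candidate in parse_qs(parsed.query).get("key", []):
        if KEY_PATTERN.fullmatch(candidate):
            return candidate
    return None


# A made-up key of the right length, not a real spreadsheet.
fake_key = "1" + "aB3-" * 10 + "xYz"
print(extract_key("https://docs.google.com/spreadsheets/d/" + fake_key + "/edit#gid=0") == fake_key)
print(extract_key("https://example.com/not-a-spreadsheet"))
```

Returning None for anything that doesn't match lets the caller show a friendly error instead of handing a bad key to the spreadsheet parser.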

The preview uses the same Tabletop.js + Tablesaw combination that the downloadable files use, but it gets the spreadsheet key and the column definition from URL parameters.
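Passing the key and column definitions through URL parameters might look something like the Python sketch below. The parameter names key and columns, the JSON encoding of the column list and the preview.html path are all assumptions made for illustration. The post doesn't say how the webapp actually encodes them.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse


def build_preview_url(base, key, columns):
    """Encode the spreadsheet key and column definitions into a query string."""
    query = urlencode({"key": key, "columns": json.dumps(columns)})
    return base + "?" + query


def read_preview_params(url):
    """Recover the key and column definitions from a preview URL."""
    params = parse_qs(urlparse(url).query)
    key = params["key"][0]
    columns = json.loads(params["columns"][0])
    return key, columns


# Hypothetical column definitions: spreadsheet field name, then display name.
columns = [["org", "Organization"], ["discount", "Discount"]]
url = build_preview_url("preview.html", "EXAMPLE-KEY", columns)
print(url)
print(read_preview_params(url))
```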

Other notes

The webapp doesn't change the actual responsive table code generated by the original app. The table breakpoint is still set to 60em/960px, and works best for tables with 5-7 columns.

You can see an example of a table rendered with our rig here: http://nerds.inn.org/wp-content/uploads/static/discounts/

And an example of that same table embedded via iframe: http://nerds.inn.org/discounts/

View the code in the INN/responsive-tables repository on GitHub (bug reports and pull requests welcome!): https://github.com/INN/responsive-tables/

How To Use GitHub Wikis For Collaborative Documentation

This week I've been updating the documentation for INN's Largo WordPress theme. The documentation previously lived on the project website but we're now adding a wiki attached to the INN/Largo repository on GitHub.

In this post I'll use the Largo wiki as an example to show you how to set up a GitHub wiki for your project.

Setup

GitHub wikis are themselves git repositories.1 This means you can keep track of changes to the wiki and accept issues and pull requests to a repository made to contain the wiki.

To allow contributors outside of our team to submit pull requests and issues to the wiki, it makes sense to set it up as a separate repository. When the project updates, we'll push updates from the wiki repository to the wiki on the main repository for the project.

For our purposes, the container repository for the wiki is called Largo-docs.

These are the remotes I have for Largo-docs:

$ git remote -v
origin	git@github.com:INN/Largo-docs.git (fetch)
origin	git@github.com:INN/Largo-docs.git (push)
wiki	git@github.com:INN/Largo.wiki.git (fetch)
wiki	git@github.com:INN/Largo.wiki.git (push)

The first remote there, origin, is the container repository for the wiki content (Largo-docs). The second is the wiki that's attached to the main Largo theme repo.

On GitHub, the URIs for project wikis insert .wiki between projectname and .git.
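As a small illustration of that naming rule, here is a Python helper that derives the wiki remote from a repository remote. The function name is made up. The example URL is built from the wiki remote shown above with .wiki removed.

```python
def wiki_remote(repo_remote):
    """Insert .wiki before the trailing .git of a GitHub remote URL."""
    suffix = ".git"
    if not repo_remote.endswith(suffix):
        raise ValueError("expected a remote URL ending in .git")
    return repo_remote[: -len(suffix)] + ".wiki" + suffix


print(wiki_remote("git@github.com:INN/Largo.git"))  # git@github.com:INN/Largo.wiki.git
```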

GitHub project wikis are ‘magical’ repositories that are associated with your project, but are not part of the project repository. When you push to them, the wiki is rebuilt with the new content automatically. If you use GitHub Pages, it’s very similar.

Since Largo-docs is just another GitHub repository, you can provide users a link to the repository’s issues page, where they can give you suggestions and pull requests. If you want to allow people to contribute to the wiki without allowing people to directly edit the wiki, this is the way to do it.

When it comes time to publish new content to the wiki associated with the main theme repository, I make sure I'm on the master branch of the Largo-docs repository and then I push to the remote I set up for the wiki on the main Largo repository.

It's as simple as:

git checkout master
git push -u wiki master

Organizing the Wiki Repository

Keeping the wiki in a separate repository also makes organization easier.

By default, GitHub wikis are flat, with no hierarchy. Inside a repository, you can create folders and place individual wiki pages in them.

These pages are simply separate files, where the filename becomes the page name. For example, example-filename_here.md becomes “example filename_here”. Use capitalization and dashes to create proper article titles.

The folder structure isn't reflected when GitHub generates the wiki, but it's great for your own organization.
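The naming rule described above can be sketched as a short Python function: drop the folders, drop the extension, and turn dashes into spaces. This only mirrors the behavior as the post describes it, not GitHub's actual implementation. The function name and the second example path are made up.

```python
from pathlib import PurePosixPath


def page_title(path):
    """Derive a wiki page title from a file path in the wiki repository."""
    # Folders are ignored and the extension is removed, leaving the file stem.
    stem = PurePosixPath(path).stem
    # Dashes become spaces; underscores and capitalization are left alone.
    return stem.replace("-", " ")


print(page_title("example-filename_here.md"))   # example filename_here
print(page_title("setup/Installing-Largo.md"))  # Installing Largo
```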

If you want to structure the wiki, with some posts being parents to others, you can create a table of contents for the wiki and put it in a file named _Sidebar.md. The underscore is important.

Create the hierarchy with an ordered or unordered list, and use horizontal rules as separators. Within posts, use Gollum syntax to link between pages.

Authoring and Editing

GitHub doesn't use straight Markdown for wikis, though. Wikis can use nine different markup languages for posts, with Gollum syntax used for internal links and images.

Mixing Gollum and Markdown markup leads to some interesting formatting, but nothing deal-breaking. In the container repository, GitHub will only render GitHub-Flavored Markdown (GFM). The Gollum-style inter-page links and image links will be ignored, leading to situations like this in the container repo:

This is what happens when you mix Gollum link markup and Markdown in a page: GitHub only renders the Markdown.

Not pretty, but hey, it works!

If you're looking to document your project in a way that others can contribute to, but you don't necessarily want to open wiki editing to everyone, this is a great way to do it!

And of course you're welcome to contribute and help improve our documentation at https://github.com/INN/Largo-docs.


1: So are GitHub Gists, incidentally. And since May 1, 2014, you can use GFM task lists in wikis.