I'm a self-taught nerd with a passion for learning. I thrive in adversity and stagnate in comfort. I push boundaries. I am a vessel of unbounded tenacity, always ready for the challenges ahead.
Venice was our least favorite place in Italy. It is extremely wet and damp. There are throngs of tourists and very few locals. As a result, there is no local culture. Prices are extremely high, partially because of the tourists and partially because everything has to be boated in. We’re talking $29 plates of pasta and $10 glasses of wine.
The romance of Venice is best found in photos like the ones above, which we worked very hard to frame. Visiting Venice is not advisable, but if you do go, you can get a feel for the place by staying only one night. Anything longer is a travesty. Much more pleasurable experiences can be found in small towns and back streets in other parts of Italy.
An hour outside of Florence, there is a lovely little town called Lucca. Fortified by gigantic walls, it emanates a tranquility that can only come from a place where no invader in history dared attack. Easily the most peaceful place I have ever visited. I look forward to coming back, perhaps staying a night or two. A refreshing pit stop on an otherwise chaotic leg of our trip.
Magnificent views from the Boboli Gardens in Florence. These gardens were essentially the backyard of the Medicis, the powerful merchant family that funded much of the Renaissance.
Coffee in Italy isn’t better than in America. Just different.
Where we prefer our full-cup drip coffee, the Italians prefer a more condensed espresso. Typically, the morning is started by heading to a nearby caffè and ordering an espresso (80 cents) or cappuccino (1.20 euro) while standing at the bar.
In Italy, because the coffee is measured in shots, there is no concept of tall, grande, or venti sizes. You can ask for doppio (double) caffè or doppio latte (milk), however.
The little coffee maker shown above makes a brew called Moka. It produces a slightly different flavor, one that I (and many Italians) prefer after lunch or dinner.
American drip coffee is clearly better in the winter. A big cup of hot drip coffee keeps you much warmer than a quick shot of espresso.
You will usually be served in a ceramic cup, unless you specifically ask da portare via (to take away). The Italians assume you’re there to enjoy the experience, not just the product. How classy.
Pret a Manger is one of my favorite fast food options in NYC. And Chipotle isn’t bad. Here’s to making quality fast food ubiquitous, from London to Rome to New Orleans.
I just got off the plane from London and ate at Pret a Manger at Heathrow. It’s hard to imagine that McDonald’s owned this chain from 2001 to 2008. But you can clearly see they didn’t have the same DNA, especially when McDonald’s “oatmeal” has 21 ingredients, when in fact, oatmeal is an ingredient.
When I was a kid, fast food options were bad and worse. Now it’s Le Pain Quotidien, Pret a Manger, Panera Bread, Chipotle, and the like. Given the capitalist system that is the United States, we’ve got to fight capitalism with capitalism. And I, for one, am super optimistic that healthy capitalism will win.
On Tuesday we were scheduled for a nice relaxing vacation in the gorgeous beach towns of Cinque Terre. We took a train, which was 30 minutes behind schedule, from Florence to Pisa. Then another train to La Spezia, just a short 20 minute hop from our final destination. In La Spezia it was pouring down rain and the station was total chaos. We waited for our final train, but it never seemed to come.
After a few hours of lugging our bags up and down stairs, always anticipating the next train, we finally realized there was a major problem. The information booth was jammed, but some locals informed us that there would be no more trains that night. We called our hotel, which informed us that the road was also closed, making bus or taxi rides impossible.
We walked around in the flooded streets for hours trying to find a hotel. When we finally found one with a room available, the service sucked and my girlfriend was locked in the bathroom for 45 minutes. The key didn’t work, and the front desk guy was indifferent about it until I motioned that I was going to kick the door down if they didn’t do something fast.
Then a nun who manages the hotel came up. After messing with the lock for a while, she had my girlfriend throw the key out the window, where the nun caught it four floors below. After 20 more minutes of fussing, the door finally opened, and the nun left us in peace.
Today, we woke up and the situation had not changed, and we still had no clue why everything was shut down. Disappointed, we went back to Florence. Only then did we discover the true problem: massive flooding in Cinque Terre. Had we arrived a few hours earlier, we would have been stuck right in the middle of all the craziness.
We’re working on a new project at Squidoo that requires extracting as much metadata as possible from any given URL. As part of the project, I’ve created a simple open source PHP library called php-url-meta that crawls a URL and returns an object containing the page title, description, keywords, author, and a thumbnail image.
The library favors Open Graph tags in most cases, falling back to standard meta tags if OG tags are not present. Author data comes from rel="author" markup. Needless to say, because markup is totally up to the entity who created the URL, there are no guarantees that anything will be available.
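The fallback order described above can be sketched roughly as follows. This is an illustrative Python sketch, not the php-url-meta code itself: the class name `MetaCollector`, the function `extract_meta`, and the returned dictionary keys are all hypothetical, and it works on an HTML string rather than crawling a URL.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects the <title>, <meta> tags, and rel="author" links of a page."""

    def __init__(self):
        super().__init__()
        self.og = {}          # Open Graph properties, e.g. "og:title"
        self.meta = {}        # standard meta tags, e.g. "description"
        self.title = None     # text of the <title> element
        self.author = None    # href of the first rel="author" link
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property") or ""
        name = attrs.get("name") or ""
        rel = (attrs.get("rel") or "").split()
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("content"):
            if prop.startswith("og:"):
                self.og[prop] = attrs["content"]
            elif name:
                self.meta[name.lower()] = attrs["content"]
        elif tag in ("a", "link") and "author" in rel:
            self.author = self.author or attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and self.title is None:
            self.title = data.strip()


def extract_meta(html):
    """Prefer Open Graph values, fall back to standard tags, else None."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.og.get("og:title") or parser.title,
        "description": parser.og.get("og:description")
        or parser.meta.get("description"),
        "keywords": parser.meta.get("keywords"),
        "author": parser.author,
        "image": parser.og.get("og:image"),
    }
```

Every field can come back as `None`, which mirrors the point that nothing is guaranteed to be present in someone else's markup.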
Lately I’ve read a lot of articles proclaiming that scalability is no longer an issue thanks primarily to AWS. As a devops engineer who lives and breathes this stuff, I’d like to point out that there are oodles of other technology advances that are more critical for scalability than simply being able to spin up virtual servers on demand.
The simplest possible example of why more servers != scalability is that of a MySQL query. If you run an unindexed query on a large table, you can add more slaves all day long but you still aren’t going to be able to service requests more quickly. Add an index, and suddenly you can service hundreds or thousands of similar queries with the same amount of resources as it took to run a single unindexed query.
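To make the index point concrete, here is a small sketch that uses Python's built-in SQLite module as a stand-in for MySQL (an assumption made purely so the example is self-contained). The `users` table, its columns, and the index name are hypothetical. It asks the database how it plans to run the same query before and after the index exists.

```python
import sqlite3

# In-memory database standing in for a large production table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    ((f"user{i}@example.com",) for i in range(10000)),
)


def query_plan(conn, sql, params=()):
    """Return the human-readable plan the database chose for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


sql = "SELECT id FROM users WHERE email = ?"
args = ("user9999@example.com",)

# Without an index, every row has to be scanned to find a match.
plan_before = query_plan(conn, sql, args)

conn.execute("CREATE INDEX idx_users_email ON users (email)")

# With the index, the database can jump straight to the matching row.
plan_after = query_plan(conn, sql, args)

print(plan_before)
print(plan_after)
```

The exact wording of the plans varies by SQLite version, but the first should report a full scan of the table and the second a search using the index. Adding replicas only spreads the scans around; the index removes them.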
I’d argue that the prime enablers of web scalability are:
You might argue that AWS offers many of the services I’ve described above. It’s true. But AWS was not the first to offer them, nor is AWS the only (or even cheapest) option today.
Blekko’s CTO describes the requirements and design of the custom NoSQL database that powers its search engine. I’ll admit this is a tad bit over my head, but WOW. The author presents very complicated computer science topics in fairly mild language (helps if you’ve spent time with Lisp or map/reduce). Can’t wait for part two.
Back in December I started putting some thought into the Tumblr firehose. While the initial launch was covered here, and the business stuff surrounding it was covered by places like TechCrunch and AllThingsD, not much has been said about the technical details.
Update: CloudFlare’s CEO’s response at end of post.
CloudFlare is a service that sits in between your web site and its visitors to make pages load faster and defend against malicious users. The first time I heard about CloudFlare, I was enchanted. At Squidoo, I’ve worked for years to develop a rock solid performance and security infrastructure, and all of a sudden a company comes along that offers many of the same features for only $20/mo.
I’ve been eager for the chance to try CloudFlare, and the newly relaunched CollabFinder was the perfect test. Now that I have some experience with it, here is what I love and hate about CloudFlare:
LOVE
HATE
If I could do it all over…
I’d still pick CloudFlare. For small businesses, side projects, and new ventures, it’s simply the easiest and most effective way to speed up and secure a web site.
Have you tried it? What do you think?
Update: CloudFlare CEO Matthew Prince confirms via Twitter that CloudFlare is working on the negatives I’ve mentioned and that they do have support staff working 7 days a week. The tweet I referenced above is apparently directed only at hosting partners, not standard customers. Prince touts an average turnaround time of 3 hours on support requests, although there are no guarantees (and I’m waiting on an initial response to a ticket posted 22 hours ago). CloudFlare plans to offer a complete SLA sometime in the future.
There was a very interesting post recently from Pud, co-founder of Blippy and creator of TinyLetter. Pud is also behind Fandalism, a social network for musicians with 400,000+ users. The question he is asking is “What to do with this?”, especially on the monetization side: getting bigger?…
Build a band generator. Using the data and some filters, in one click, a Fandalist (if I may) would generate a random short list of musicians (s)he could play with on weekends. My colleague @_timothee was suggesting this to me this morning.
Say I am a jazz pianist in SF: I would get a drummer and a guitarist who are also passionate jazzmen living in the city. Another click and the website would send a notification to the matched users.
Your band generator could also have some presets and could build any kind of band/event: “Rock band”, “Gipsy guitar trio”, “Piano duel” or even “Jazz Big Band” or “Grand Orchestra”.
Of course, this is doable using each filter manually, but the magic is to make it instantaneous. Make it obvious, create new connections.
I love the band generator idea. This is something I could see us supporting on CollabFinder. While musicians aren’t allowed on the site just yet, we’ve received quite a few requests and will be opening up soon.
Are you a musician looking for a band? What kind of tools would help you in your search?
CollabFinder is the place to find a developer, designer, or writer for your next big project. It was originally created by my good friend and designer Sahadeva in 2009, but at some point had to be taken down due to a lack of development resources. That’s a shame though because several companies were started by people who found each other through the site.
And so for the past 7 months Saha and I have teamed up along with Simon to create New CollabFinder, a complete redesign using modern tools like HTML5 Boilerplate, pushState, and the Facebook and GitHub APIs.
Today’s launch wasn’t all that glorious, and I’m quite tired, but I plan on posting more details soon about the product we’ve launched and the lessons we’re learning.
I’m probably the only person who still uses SimpleCassie, an early PHP wrapper for Cassandra. I like its chaining syntax and I’m too lazy to port our code over to phpcassa (although a CQL migration seems inevitable).
Just in case there are other SimpleCassie users out there, I’ve forked it on GitHub to include TTL support (developed by Zhengjun Chen) and a parse() method to return friendly responses instead of raw Thrift objects. See the README file for details.
That’s SCRUM and TDD and all the rest; it is all those new ways of managing development projects and being super-productive and modern and buzzword-compliant; all the sprints, scrums, playing cards, and crass commercial nonsense.
The management pitch is that by getting programmers to follow some process by rote you will get good, predictable results out.
See, the thing is, the success of the coding-part of a project is dependent on the calibre of the engineers doing that coding and not the process they follow.
I’ve always believed that rigid project management is itself a time sink. The answer? Small teams, rapid prototyping, and resistance to scope creep.
Microsoft HTTP S&M? Lulz.
According to Microsoft, SPDY isn’t good enough because it forces compression and encryption, requiring more CPU and power. They’d rather support those features through extensions.
This is typical of Microsoft to make compromises that lead to an inferior product. Why not push the internet forward to more desirable defaults, instead of worrying about legacy bandwidth caps and battery limitations that will soon be overcome?
The overall pattern is fairly clear. It’s meetings and collaborative work during the day, a dinner-time break, more meetings and collaborative work, and then in the later evening more work on my own. I have to say that looking at all this data I am struck by how shockingly regular many aspects of it are. But in general I am happy to see it. For my consistent experience has been that the more routine I can make the basic practical aspects of my life, the more I am able to be energetic—and spontaneous—about intellectual and other things.
Stephen Wolfram (creator of Mathematica and Wolfram Alpha) is somewhat maniacal about keeping The Personal Analytics of My Life. It reads like an OKCupid-blog teardown of a single person.
I generally have the attitude that variety is the spice of life, but his idea that routine enables productivity makes me think that I should embrace it more often.
IF YOU COULD ONLY LEARN ONE PROGRAMMING LANGUAGE, HAXE WOULD BE IT.
That’s a pretty bold statement from a language I’ve never heard of. Where did Haxe come from, and who’s using it (for what)?
I’m working on a new stats feature to illustrate whether numbers are trending up or down. Example:
To do this, I needed to create a PHP function to analyze two numbers and determine the percentage change and trend. Here’s my solution:
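A minimal sketch of such a function is below. It is written in Python rather than PHP and is not the author's original code. The function name `percent_change`, the "up"/"down"/"flat" labels, and the handling of a zero starting value are all assumptions made for illustration.

```python
def percent_change(old, new):
    """Return (percentage change, trend) when moving from old to new.

    The percentage is None when old is 0 and new is not, because the
    change relative to a zero baseline is undefined.
    """
    if old == 0:
        pct = 0.0 if new == 0 else None
    else:
        pct = (new - old) / abs(old) * 100

    if new > old:
        trend = "up"
    elif new < old:
        trend = "down"
    else:
        trend = "flat"
    return pct, trend


if __name__ == "__main__":
    for old, new in [(100, 150), (200, 150), (10, 10), (0, 5)]:
        print(old, new, percent_change(old, new))
```

Dividing by `abs(old)` keeps the sign of the result tied to the direction of the move even when the starting value is negative.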
There is something really wrong with modern programmers. Very wrong indeed.
Mailinator creator Paul Tyma has a great blog post on how he compresses our email by 90%. He has a simple LRU cache of lines and consecutive lines from emails, so emails become a list of line IDs. He also has a background thread doing rather stronger LZMA compression on large emails. He’s winning.
Thing is, in one of the comments on the Hacker News version of the article, someone wonders why he didn’t use Redis for the LRU.
I think of the kindest, least shouting way to hint that that’s not such a good idea; I say it might be higher latency to block on a Redis LRU than using an in-process data-structure. The commenter responds by saying that Redis is known to be fast; that he has just chosen node.js and Redis for his startup because of its performance. This is wrong on so many levels. So many levels. I don’t single this person out - they are just a useful illustration. Their mindset is endemic in this industry. All around you, the new generation of programmers are making the same assumptions.
So often modern programmers feel like they have to keep up with the Joneses, applying each new bleeding edge technology as soon as it is released on GitHub.
That’s why Paul Tyma’s post about pursuit of a higher level of programming consciousness through the endless iteration of Mailinator was so refreshing. All that traffic on one box. Bravo, Paul.
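The core idea from the quoted post, emails stored as lists of line IDs, can be sketched in a few lines of in-process Python. This is only an illustration of line-level deduplication: the class name `LineStore` is hypothetical, and the sketch leaves out the LRU eviction, the consecutive-line matching, and the background LZMA pass that the post mentions.

```python
class LineStore:
    """Stores each distinct line once and represents emails as line IDs."""

    def __init__(self):
        self.ids = {}     # line text -> integer ID
        self.lines = []   # integer ID -> line text

    def encode(self, email):
        """Turn an email body into a list of line IDs."""
        encoded = []
        for line in email.split("\n"):
            if line not in self.ids:
                self.ids[line] = len(self.lines)
                self.lines.append(line)
            encoded.append(self.ids[line])
        return encoded

    def decode(self, encoded):
        """Rebuild the original email body from its line IDs."""
        return "\n".join(self.lines[i] for i in encoded)


if __name__ == "__main__":
    store = LineStore()
    first = store.encode("Hi Bob\nClick here to unsubscribe\nThanks")
    second = store.encode("Hi Ann\nClick here to unsubscribe\nThanks")
    print(first, second, len(store.lines))
```

Two similar emails share most of their line IDs, so the shared text is held in memory once. Everything is a plain dictionary lookup inside the process, with no network round trip.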
There’s no trendier topic among tech journalists. So much so that it’s easy to forget that more data does not always lead to more accurate conclusions.
I predict that big data will give way to “smart data,” describing both the smart application of data and the efficiency of retaining only what’s absolutely necessary.
I’ve had great success using Cassandra for real time querying, but have only recently begun exploring more complex reporting queries.
I knew a report I needed recently would be hitting the database pretty hard, so I isolated one of our nodes by removing it from the balancer serving our front end users. I used a ConsistencyLevel of One to make my reads as fast as possible. Unfortunately, I was forgetting about Cassandra’s read repair mechanism.
When a query is performed under ConsistencyLevel of One, the first node to respond will return the result to the client, but all replicas are still contacted in the background. This means that a client connecting to one Cassandra node can still impact performance on another node. With my batch report, I experienced rising CPU and memory usage on our entire cluster, to the point where it was impacting real time queries.
Cassandra’s view is that optimizing for both real time queries and batch reporting on the same server is futile. Instead, the common pattern for supporting both workloads is to use Cassandra’s Rack Aware strategy to create two different clusters (aka “racks”) with the same data but different performance configurations and workloads.
All well and good for those with the luxury to purchase gobs of hardware, but for the rest of us there aren’t many options. It would be great if Cassandra supported a ConsistencyLevel where only one replica is contacted at all. Unfortunately, given the fact that there’s no guarantee the server the client contacts is the one holding the data, this probably is not possible.
Machine Learning was an intimidating phrase to me for a long time. I always assumed it was a dark art accessible only to those with computer science degrees.
Fortunately, in the past couple years there have been an increasing number of tutorials to describe the practical implementation of such algorithms. An excellent example is Joel Grus’ Hacking Hacker News article, which describes how he set up and trained a Bayesian algorithm to recommend upcoming articles that he would like, regardless of their popularity on HN itself.
One newbie mistake I made was focusing too much attention on which algorithm to use for a particular task. It’s true that algorithms like SVM, naive Bayes, and neural networks work best for certain types of tasks. However, it’s like building a house and obsessing over which type of hammer (ball peen, rubber mallet) to use: if the choice of hammer becomes more important than whether you’re building on a solid foundation, you’re doomed.
Machine learning is all about the care and feeding of the algorithm. The most important questions are (1) what’s your training set, and (2) how are you validating it? If you focus on those two questions, the rest falls into place easily.
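Those two questions can be illustrated with a tiny holdout-validation sketch in plain Python. Everything here is a toy assumption: the helper names `train_test_split` and `accuracy`, the made-up data set of numbers labeled by whether they are at least 50, and the one-parameter "model" that learns a threshold. The point is only that the model is scored on examples it never saw during training.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and hold out a fraction of them for validation."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, labeled):
    """Fraction of (input, label) pairs that predict() gets right."""
    correct = sum(1 for x, y in labeled if predict(x) == y)
    return correct / len(labeled)


# Toy data set: the label says whether the number is at least 50.
data = [(n, n >= 50) for n in range(100)]
train, test = train_test_split(data)

# "Training": put the threshold halfway between the largest negative
# example and the smallest positive example seen in the training set.
largest_negative = max(x for x, y in train if not y)
smallest_positive = min(x for x, y in train if y)
threshold = (largest_negative + smallest_positive) / 2


def predict(x):
    return x >= threshold


print("held-out accuracy:", accuracy(predict, test))
```

Swapping in a fancier algorithm changes only the "training" step. The split and the held-out score stay the same, which is why they deserve the attention first.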