The new trend among web 2.0 websites is that they are becoming more dynamic. Dynamic effects make a website more usable, respond quickly to the user, and connect users in real time with data and with each other. Improvements in JavaScript performance and efforts to create easier-to-use libraries make this task a lot easier. The JavaScript libraries are still in their fledgling state, though. There is no standardization for these libraries yet, and there are not enough technically trained people for them to be widely used.

We do not see this as a setback for working with JavaScript libraries. In the apartments domain, except for the recent map-based mashups, most sites are largely static 1.0 pages. It is not unusual for a user to submit three pages of forms before reaching the results, so there is still a lot of room for improvement. Recently popularized web 2.0 applications are simple in nature and self-contained, and they do not attempt to add unnecessary content (e.g. selling unrelated services to users or showing too many ads). You’ve probably stumbled upon them: mint.com and rememberthemilk.com, to name a few. They are self-contained applications operating on the web rather than websites in the traditional sense. Besides simplicity, I think their major advantage is that they don’t demand too much from users or distract them. Submitting a form is demanding for the user, waiting for a page to load is demanding for the user, and asking the user to refresh the page is demanding for the user. Giving immediate feedback is the best user experience possible.

Building a self-contained JavaScript application is not an easy task, but there are many options out there to help you, ranging from simple libraries like jquery and prototype to more comprehensive frameworks like extjs, yui, and dojo. The simpler the library, the easier it is for you to customize. To brand a website for your company, I suggest you go for simpler libraries so you can build an exact look and feel for your website. The more complete frameworks have their own predefined layouts and their own look and feel for panels, so your brand can be confused with other websites built on the same framework.

On the web programming side, there is more demand for trained programmers than for traditional web wizards who come from a design background. Traditionally, programmers have thought of JavaScript as a tool web designers use to patch up their websites, so many never learned it seriously. Nowadays, however, JavaScript libraries are built upon classes, inheritance, and so on, and the whole application is structured into modules and classes. It is no longer a web wizard’s job, but rather a job for traditional GUI designers. Whichever background you come from, you need to brush up on JavaScript 101 for web 2.0 applications, and it requires all the things you’ve learned about programming. The technology is volatile; as long as we keep evolving, we will always be on the cutting edge.

Paul Yuan

As a vertical search engine, Cazoodle has been doing data processing with Hadoop MapReduce. Like many others, we use MapReduce for crunching large-scale data, for analysis tasks such as data annotation and page classification. However, with large-scale data, some problematic records may fail and then crash the entire MapReduce task. As we don’t necessarily know where these culprits are, we generally use a try/catch block to detect them.

Code:
try {
    // ... process data
} catch (Exception e) {
    // ... handle errors, or just skip the record.
}

But what if we use native code that may crash the whole JVM? For example, when we call a native C library through JNI, there may be null pointer errors in some rare cases. Or some records may lead to an OutOfMemoryError, which then leaves the JVM in an unpredictable state, like the following:

Code:
try {
    // ... process data
    // Oops, the JVM crashes sometimes
} catch (Exception e) {
    // Sorry, no exception, because the JVM has crashed.
}

This looks like it is caused by a bug in the code; however, sometimes it’s unavoidable behavior. For example, what if we are using a third-party library that we cannot change?

So our goal is to skip problematic records when the task is re-run in another Mapper/Reducer. However, the task may be assigned to another machine in the cluster, so using a local log to skip the record is impossible. Here the popular Memcached comes in handy. The code becomes:

Code:
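// add() fails if the key is already cached, i.e. an earlier attempt crashed on this record
if (!memcacheClient.add(String.valueOf(key), Boolean.TRUE)) {
    // ... log the record and skip it
    return;
}
try {
    // ... process data
    // Oops, the JVM crashes sometimes
} catch (Exception e) {
    // Sorry, no exception, because the JVM has crashed.
}
// Only reached if the JVM survived, so the marker can be cleared
memcacheClient.delete(String.valueOf(key));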

Whenever a record crashes the JVM, its key is never deleted from the cache. When the Mapper is re-run, the record is logged for further debugging instead of being processed (and crashing the JVM again). The only requirement for this approach is that each record has a unique key to store in the cache.
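To make the marker idea concrete, here is a minimal sketch that runs without a Memcached server. The names MarkerStore, InMemoryMarkerStore and GuardedProcessor are hypothetical and not part of our codebase or of any memcache client; the in-memory map only stands in for the shared cache, and a JVM crash is imitated by an Error that escapes the catch block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for the shared cache; only add and delete are needed. */
interface MarkerStore {
    /** Stores the key if absent. Returns false when the key already exists. */
    boolean add(String key);

    void delete(String key);
}

/** In-memory store (an assumption) so the sketch runs without a cache server. */
class InMemoryMarkerStore implements MarkerStore {
    private final ConcurrentMap<String, Boolean> markers = new ConcurrentHashMap<>();

    @Override
    public boolean add(String key) {
        return markers.putIfAbsent(key, Boolean.TRUE) == null;
    }

    @Override
    public void delete(String key) {
        markers.remove(key);
    }
}

/** Hypothetical wrapper around the per-record work of a Mapper. */
class GuardedProcessor {
    private final MarkerStore store;
    private final List<String> skipped = new ArrayList<>();

    GuardedProcessor(MarkerStore store) {
        this.store = store;
    }

    /** Returns true if the record was attempted, false if it was skipped. */
    boolean process(String key, Runnable work) {
        if (!store.add(key)) {
            // A marker is still present: an earlier attempt never finished.
            skipped.add(key);
            return false;
        }
        try {
            work.run();
        } catch (RuntimeException e) {
            // Ordinary failure: the JVM is still alive, so fall through.
        }
        // Deliberately not in a finally block: a crash must leave the marker.
        store.delete(key);
        return true;
    }

    /** Keys that were skipped because a marker was already present. */
    List<String> skippedKeys() {
        return skipped;
    }
}
```

In a real job the store would be backed by the memcache client instead of a map, so that a re-run on another machine sees the same markers. The one design choice worth noting is that the delete sits after the try/catch rather than in a finally block, because the marker must survive anything that kills the attempt.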

Thanks to the high performance of Memcached, there is almost no overhead. In our experience, one memcache server can handle 300 tasktrackers in a job that parses web pages. The internal bandwidth usage is less than 100 kb/s. Now we can spend more time on developing a better service rather than fighting with problematic records. :D

Remark: In Hadoop 0.19.0, there is a new feature to skip records that cannot even be read (http://issues.apache.org/jira/browse/HADOOP-153).

York Tsai

Developer Team

The real estate market is clearly feeling the pinch of the bad economy. According to compete.com, web traffic to real-estate companies like Trulia and Zillow decreased in the month of September by 11.8% and 2.2% respectively. Recently, Zillow decided to lay off 25% of its workforce. Rich Barton, CEO of Zillow.com, said in his recent posting titled “Difficult times, Difficult decisions” on October 17, 2008:

This week we are reducing our workforce by 25%. This was an incredibly painful decision for me and the leadership team, but, in the end, we concluded that we had no choice but to securely batten down the hatches as we sail into a major economic storm.

Redfin laid off 10% of its workforce earlier in the month. Current times are clearly proving hard for real-estate companies.

Twelve months ago, many CEOs were still optimistic about growth in real estate. I remember that back in November 2007, at the ILM (Interactive Local Media) 2007 conference hosted by the Kelsey Group, the CEOs of real estate companies like Zillow and HomeThinking were very optimistic in their outlook on real estate traffic and online revenue. Obviously, they were oblivious to the ongoing trend. They cannot be blamed for misjudging it. Even the experts on Wall Street have been taken by surprise by the current rapid turmoil. It was extremely difficult to predict that we would see the worst economic slump since the Great Depression.

The rental market is faring better than real estate. According to the latest Harvard report on “State of the Nation’s Housing in 2008”:

Rental housing is reasserting its importance in US housing markets. With so much turmoil on the for-sale side, many households have reconsidered their financial choices and opted to rent rather than buy.

These days, it is getting increasingly difficult to get a loan to buy a home. People who are not able to secure credit to buy homes are turning to rental housing. In a way, the collapse of the real estate bubble is translating into greater demand for rental housing, and people are looking for affordable rentals.

We hope to do our share to help people find their new place easily. Cazoodle Apartment Search provides comprehensive listings for major metropolitan areas like New York City, the San Francisco Bay Area, Chicago, and Los Angeles, and we are expanding quickly to other locations like Miami, Philadelphia, and Detroit. Our plan is to cover the top 25 metropolitan areas by early next year. We are seeing an increase in traffic month over month and hope to continue the growth as we expand to new locations and improve the overall system. Cazoodle is a group of immensely smart, talented, and passionate people who are constantly working to achieve their vision.

Arpit Jain

Developer Team


How we started Apartment Search

October 26th, 2008

The idea of apartment search came a long time ago, when I was an undergraduate taking Kevin Chang’s (founder of Cazoodle and CS professor at U of I) database course. My team chose to build an apartment search for the class project, and we innovated by using the just-launched Google Maps API. At the time, we were manually entering apartments into the database, and we asked Kevin how we could gather more data. The traditional process, he said, was to hire programmers to write parsers for sites, and that effort could be greatly simplified in the future.

It was quite clear from my undergraduate project that collecting apartments is a technical challenge, and that integrating scattered apartment websites can really bring value to our school community and beyond. I noted his idea but never knew what advanced research would be required to get us there.

Right before my graduation in spring 2007, Cazoodle launched with a powerful data extraction tool that can build wrappers easily. Apartment search launched with bountiful data. Although it was created by a different team of 20 people and was unrelated to my undergrad project, I was happy to see the opportunity the company has.

Launched in Champaign, apartment search has gained wide coverage of apartments in this area. Popular sites like apartment.com had 5 apartments for Champaign, and Google Base had 19. We have hundreds of apartments collected, covering popular and lesser-known landlords. Our approach is bottom-up: we expand location by location, crawling both landlords and listing sites. State-of-the-art apartment sites use the latter approach, and with our technology advantage, we use data crawling to cover each location comprehensively and exhaustively. Why build another apartment search? In short, we learned from the Google way how to make apartment search more comprehensive. We aim to introduce a one-stop portal for apartment search that not only benefits the user, but also drives traffic to other apartment sites to complement their service.

It will require a leap of faith to believe that we can scale up to compete with the big players. In the past year, our technology has matured, and we can now scale to a new location in a short time. The dream of covering the entire US map is no longer out of reach. With the aim of making every location as comprehensive as Champaign, I hope every apartment you search for online will be in our database in the near future.

Paul Yuan

Developer Team


Boston Launched

October 3rd, 2008

As promised, we have added Boston as our next location. It covers 30 miles around Boston, from Haverhill in the north, Marlborough in the west, and Bridgewater in the south to, obviously, the Atlantic Ocean in the east :). We have collected ~8000 apartments from more than 70 Boston sources; from local landlords to big national sites, we have them all. Boston launched a little later than our target date of Sept 28 because our data crawling team was moving to a brand new office. Moving always takes time, especially when you have to set up everything from electricity to computers yourself. Well, now everything is set, and our data crawling team is happier than before and has promised to work harder :)

The next location will be Dallas. We recently finished collecting the apartment sites for Dallas and, to our surprise, found that there are more than 100 sources, even more than Boston! We were expecting Dallas to be smaller than Boston; after all, Boston is more popular, right? But not to worry, we will work more hours and finish collecting Dallas apartments by the end of October. After that, by popular demand, we will work on Miami.

You can suggest your location at http://apartments.cazoodle.com and help it get the highest number of votes.

Cazoodle Apartment Search has expanded and now also covers Chicago (~29000), New York City (~30000), Los Angeles (~15000), and Seattle (~3500), in addition to the SF Bay Area (~14500) and Urbana-Champaign (~625). We plan to add one new location every 3-4 weeks. Boston and Dallas will be the next locations we add.

Your location not covered? You can vote for your location at http://apartments.cazoodle.com and we will add it to our list.

If you have any feedback on the site, please submit it here.

Thanks,
Arpit Jain
Developer Team

We found that the site took a long time to load, especially the narrow your search panel. We did load balancing, enabled caching, and optimized the code, and now the site loads much faster.

If you still have problems loading the website, please let us know.

Arpit Jain
Developer Team

New Features Added

May 17th, 2008

We have added many new features to the site:

1. Apartments from different sources merged: If an apartment is present on multiple websites, the listings are merged and shown as a single apartment. In the infowindow, you can still see all the other sources.

2. Narrow your search panel resizes itself: Depending upon the number of options in the narrow your search panel, it will resize itself. This removes a lot of pain.

We are continuously updating the website and adding new features. If you have any suggestions, please let us know.

Arpit Jain
Developer Team

Welcome to SF Bay Area Apartment Search. Here you will find all apartments in the Bay Area.

We have collected apartment information from over 250 sources in the Bay Area so that you do not need to go anywhere else to search for apartments. Our interface provides useful functionality to easily browse and search through many apartments.

We are continuously developing and updating the website and will keep you informed on this blog page. If you have any suggestions or comments on how to improve the apartment search experience, please let us know.

Arpit Jain
Developer Team