FMC1202: The Wonderfully Weird World of Software: September 2009

Tuesday, September 22, 2009

Presentation session on Sep 15th

Here's my take on our presentation session on Sep 15th. This was our first presentation session, so I thought of writing a roundup myself.

"Windows 7, the next generation of surrounding awareness applications" by Tan Chun Siong & Ho Tian Shun

The very first FMC1202 student presentation (historical moment ;-) was done by Chun Siong and Tian Shun. Thanks to both for a good presentation and for volunteering to go first.

Among other things, the presentation introduced us to the Windows 7 "sensor framework" API that helps us to write "surrounding-aware" applications in a hardware-independent way. According to what I understood, now we can write surrounding-aware applications without having to worry about specifics of the hardware component being used by the application. For example, we can write an application that deals with GPS data without worrying about who manufactured the GPS module being used. It was good to see the claims backed up by code examples and live demos. The demo of auto adjusting text size to match the ambient lighting was cool. Thanks for the Windows 7 DVDs too - we know you went out of your way to produce those for us.

"Facebook the social network" by Ma Siming & Zhan Shiyu

This presentation started with a "not so technical" look at the advantages and disadvantages of social networks. I especially liked the part about ways to make money using social networks. I agree that it is a good place to play games. In fact, it appears most of my Facebook friends are playing "Barn Buddy", "Farmville", etc. during office hours. Fortunately, their bosses are not in their friends list, or else they would be out of a job before long. Another point raised was the added level of trust when business happens over a social network. That is very true. I'd rather buy from someone I know (directly or via a friend) than from a total stranger. However, I later thought of two things that could cause problems.

People may prefer to sell to strangers (if you sell to friends, you have to lower your profit margins, and you risk losing your friend if something goes wrong with the sale or the merchandise. Too messy, don't you think?)
Once the intent to sell/buy is established via the social network, the buyer and seller might complete the deal outside the social network (to avoid paying commission). This defeats the purpose from the point of view of the social network.

A question from the class resulted in a discussion about one danger of using virtual worlds: some users might not realize where the fantasy ends and the real world begins. For example, someone who plays car racing games in the virtual world may be tempted to speed in real-life driving too. It is a good point indeed.

"How videos are processed?" by Zhang Haoqiang & Li Shiyan

I will always remember this presentation as where I learned the meaning of 'video'. (According to Haoqiang, it derives from the Latin word that means something like "I see".) I liked the bit about the history of video processing as well. Often knowing how something was done in the olden days helps to understand how it is done now.
The mid-presentation discussion on how digital is different from analog was interesting too. Later I found this page that explains the difference.

Another insightful point mentioned was why videos are harder to encode (it takes hours of processing and a high-end PC to encode a video) than to decode (most low end PCs can play videos without any time lags).

In the 2nd part we went a bit deep (perhaps a bit too deep) into AVI format.

"Things you wished you knew about spyware / malware" by Tan Chun Siong

Our self-professed "geek" Chun Siong came back to share his knowledge about spyware/malware with the class. We all should be thankful to Chun Siong for sharing many useful tit-bits of info with us. For example, the bits about using on-screen keyboards to thwart key loggers and the dangers of using pirated software were very useful.
Thanks for the tip about VirusTotal. It is a handy online virus detector that uses multiple detection engines. I have used it several times already since I got to know about it from Chun Siong.
As mentioned, bloatware is an annoying problem too, and Acrobat is one example. Has anyone tried to convert a doc to pdf using Acrobat writer? The visual "acrobatics" it performs during the conversion are as annoying as the long time it takes to finish the conversion.

To wrap up, it was a very good presentation session considering this was our first. Kudos to all six presenters once again...

Thursday, September 17, 2009

How YouTube Works

This report is based on the interesting seminar held on Friday, Sep 4th, 2009. Actually, for security reasons, no one knows exactly how YouTube works, except for the professional programmers working for YouTube. Therefore, most of the discussion we had in the seminar about the biggest video-hosting site is theoretical and based on subjective opinions. However, getting to brainstorm the process of creating such a great video-hosting site brought us a lot of inspiring ideas.

In the seminar, we discussed 6 problems that a video hosting site has to confront and how YouTube solves them.

1. Video compression:

According to the latest statistics, 20 hours of video are uploaded to YouTube every minute. Assume the videos have a resolution of 640x480 and a frame rate of 30 frames/second, and that the frames are stored as consecutive bitmap images (1 pixel of the image is represented by 3 bytes). Without compression technology, YouTube would have to store approximately 2 TB of video per minute. Even if videos could be losslessly compressed at the same rate as a Word document (to ~25% of the original size), YouTube would still have to store 0.5 TB per minute.
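
The estimate above can be checked with a few lines of arithmetic. This is only a sketch under the assumptions stated in the paragraph (640x480, 30 frames/second, 3 bytes per pixel, 20 hours uploaded per minute); the variable names are our own.

```python
# Back-of-the-envelope check of the uncompressed storage estimate.
WIDTH, HEIGHT = 640, 480        # assumed resolution
BYTES_PER_PIXEL = 3             # one byte each for red, green and blue
FPS = 30                        # assumed frame rate
UPLOADED_SECONDS = 20 * 3600    # 20 hours of video arrive every minute

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
raw_bytes_per_minute = bytes_per_frame * FPS * UPLOADED_SECONDS
raw_tb = raw_bytes_per_minute / 1e12    # decimal terabytes
compressed_tb = raw_tb * 0.25           # if files shrank to ~25% of their size

print(f"raw: {raw_tb:.2f} TB per minute")
print(f"at 25%: {compressed_tb:.2f} TB per minute")
```

One raw frame is a little under 1 MB, so a single second of video is already about 27 MB before any compression.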

Therefore, a video compression technology must be applied to reduce the amount of information that we have to store. As most of you know, any color can be displayed on the monitor by mixing certain amounts of red, green and blue together, and an image in bitmap format also stores the color of each pixel as RGB. However, there are other ways to represent a color, and one of these systems is called HSL (Hue, Saturation, Lightness/Luminosity). Since human eyes are more sensitive to brightness than to color, the data representing color (rather than brightness) can be thrown away for a certain number of pixels in the picture.

In a video, there are a lot of temporal redundancies, i.e. the part of the image that stays the same for multiple frames. Hence, we can record the changes between one frame and the next instead of the whole frame to reduce the amount of information. Many video codecs use this principle to reduce the size of the video but still retain a part of the quality. Another method to bring down the video size is to reduce the quality, i.e. sharpness of image, video resolution, etc. All the methods we discussed above use the principle of lossy compression: since we compress by throwing away information, it's impossible to decompress it back to the original.
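
The idea of recording only the changes between frames can be sketched as follows. This is a toy illustration, not how any real codec (or YouTube) does it: the frames are short lists of greyscale values, and `frame_delta` and `apply_delta` are names we made up.

```python
def frame_delta(prev, curr):
    """Return {pixel_index: new_value} for the pixels that changed."""
    return {i: c for i, (p, c) in enumerate(zip(prev, curr)) if p != c}

def apply_delta(prev, delta):
    """Rebuild the next frame from the previous frame plus the changes."""
    frame = list(prev)
    for i, value in delta.items():
        frame[i] = value
    return frame

# Two tiny 8-pixel greyscale "frames"; only two pixels differ.
frame1 = [10, 10, 10, 10, 200, 200, 10, 10]
frame2 = [10, 10, 10, 10, 10, 200, 200, 10]

delta = frame_delta(frame1, frame2)    # only 2 entries instead of 8 pixels
rebuilt = apply_delta(frame1, delta)
print(delta, rebuilt == frame2)
```

Note that this toy delta is exact. Real codecs save far more space by also ignoring small differences and by tracking moving blocks, and that is where information is lost.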

In fact, YouTube uses the FLV format to serve video to users. This format allows a typical compressed video to be further compressed to only 25% of the original size. Of course, that comes with a reduction in resolution and quality. Recently, YouTube has introduced the MP4 format to serve high-definition videos to users.

2. Video distribution:

YouTube has to serve 1.2 billion views daily. So how do they manage to stream the video to viewers at high speed and at the same time keep their Internet bill low?

Firstly, YouTube installs servers in many places around the world. This allows much faster connections between the users and the server because, instead of establishing a possibly slow connection between 2 locations on opposite sides of the Earth, a faster local connection is established for uploading and downloading videos.

Secondly, aside from using their own CDN (Content Distribution Network), YouTube partners with other CDNs such as LimeLight, LiveStream (formerly known as Mogulus), etc. to serve their videos. There has been an instance where YouTube relied on Akamai, another CDN especially experienced in streaming live content, to stream a YouTube concert live to 700,000 concurrent viewers.

Thirdly, YouTube signed peering contracts with ISPs to bring down their Internet bill. Basically, if ISP A transfers data to ISP B and ISP B transfers about the same amount back to ISP A, neither of them has to pay the other for carrying the data.

3. Video fingerprint:

While YouTube is an effective way to share one's own video footage, it also has the potential to spread copyrighted content, pornography, defamation and material encouraging criminal conduct. Therefore, a system to detect and delete these kinds of videos as soon as possible is needed. In our discussion, we mainly guessed at how YouTube detects copyrighted content uploaded to their servers within only several minutes, without relying too much on the users to flag the videos.

A video fingerprint is defined as a set of features that make the video unique and therefore easily recognizable when given the fingerprint. Several ideas for creating a video fingerprint were suggested during the discussion. One idea is to compare the uploaded video with the copyrighted content to find any similarity. However, it would require a copy of the source on the server, and that would require a lot of resources. Another idea is to store the color ratios of several minutes' worth of frames as the fingerprint, to lessen the amount of data that has to be stored, to enable the video to be identified regardless of size and also to prevent false recognition. There was also a suggestion that checksums of the pirated versions of videos (e.g. CRC32, SHA1, etc.) be stored to compare with what the user uploaded. The fingerprint produced with this method is truly unique and small compared with the previous ones. However, this method will not work if the file is edited or split into multiple parts.
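
The trade-off between the last two ideas can be shown in a small sketch. It is only an illustration of the guesses made in the discussion, not YouTube's actual method: a "video" here is just a list of (r, g, b) pixels, the second copy imitates a slightly darker re-encode, and all function names are our own.

```python
import hashlib

def to_bytes(pixels):
    """Flatten (r, g, b) tuples into raw bytes, standing in for a file."""
    return bytes(channel for pixel in pixels for channel in pixel)

def checksum_fingerprint(data):
    """Exact-match fingerprint: SHA-1 of the whole file."""
    return hashlib.sha1(data).hexdigest()

def color_ratio_fingerprint(pixels):
    """Share of red, green and blue over all pixels, rounded to 2 places."""
    totals = [sum(p[i] for p in pixels) for i in range(3)]
    grand_total = sum(totals) or 1    # avoid dividing by zero on all-black input
    return tuple(round(t / grand_total, 2) for t in totals)

original = [(200, 50, 50)] * 100 + [(20, 20, 220)] * 100
# A "re-encoded" copy: same picture, every channel dimmed by 10%.
reencoded = [(r * 9 // 10, g * 9 // 10, b * 9 // 10) for r, g, b in original]

same_checksum = (checksum_fingerprint(to_bytes(original))
                 == checksum_fingerprint(to_bytes(reencoded)))
same_ratio = (color_ratio_fingerprint(original)
              == color_ratio_fingerprint(reencoded))
print(same_checksum, same_ratio)
```

The checksum no longer matches once the file changes at all, while the color ratio survives the change, which is the weakness of checksums mentioned above.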

4. Video thumbnails:

Video thumbnails are the tiny images that give the user a glimpse of the content inside the videos. Most users decide whether a video is worth watching based on what they see in the thumbnail. Therefore, having a good thumbnail that reflects the content inside is very important and will influence the popularity of a video. Currently, YouTube captures the frames at 25%, 50% and 75% of the video to use as thumbnails. However, we all agree that this method is not guaranteed to give the user the best overview of the video. In this section, we mainly discuss how we can improve the system of capturing thumbnails so that they display the subject matter of the video.

We can't just pick the frame at a particular position or randomly pick any frame because that has been proven ineffective through practice. Hence, we have to intensively process the frames inside the video to pick out the most appropriate one(s). There are many suggestions proposed during the discussion: face recognition (e.g. get the face of the distinguished person in the video showing his speech), OCR (e.g. get the text/banner/etc. in the video that matches the title), or search for the object that occurs most frequently in the video.

5. Video recommendation

Video recommendation is a function on YouTube meant to provide viewers with possibly useful videos to watch next. However, all of us agree that sometimes, this feature doesn't help us choose a good video to watch at all. In our discussion, we go over some of the possible methods to improve the system of recommendation.

The first way is to ask the users to indicate their preferences in their profiles so that a better recommendation can be made based on the information given. The second way is to record what previous users viewed next after watching the same video; we can then pick out the most popular ones to recommend to the user in question. The third way is basically similar to the second; however, instead of looking at the statistics of only one video, we search for other users who have watched the same videos as our user, then look for videos that those users have also watched but our user hasn't, and recommend them. As we can see, the second method will work better for new users who have yet to watch many videos, and the third method will be more effective for old users, who have a viewing history that we can rely on to make our recommendation.
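
The third way can be sketched in a few lines. This is a minimal illustration of the idea from the discussion, with a made-up viewing history and a function name (`recommend`) of our own; it makes no claim about how YouTube actually recommends videos.

```python
from collections import Counter

def recommend(history, user, top_n=3):
    """Suggest videos watched by users who share at least one video with `user`.

    `history` maps a user name to the set of video ids they have watched.
    """
    seen = history[user]
    votes = Counter()
    for other, watched in history.items():
        if other == user or not (seen & watched):
            continue                    # skip the user and unrelated viewers
        for video in watched - seen:    # only videos our user has not seen
            votes[video] += 1
    return [video for video, _ in votes.most_common(top_n)]

history = {
    "ann":  {"cats", "dogs", "birds"},
    "bob":  {"cats", "dogs", "fish"},
    "cara": {"cats", "fish", "cars"},
    "dave": {"cooking"},
}

print(recommend(history, "ann"))
print(recommend(history, "dave"))
```

Here "ann" is offered "fish" first because two similar viewers watched it, while "dave" gets nothing because nobody shares his history. That empty result is why the second method suits new users better.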

6. Video spammers

YouTube implemented a video response function that allows users to post a video in response to other users' videos. However, this function has been misused by malicious users to post video responses irrelevant to the topic under discussion. Their objectives include, but are not limited to, advertising, increasing the popularity of their own videos, distributing pornography or simply polluting the system. (From this article). Hence, we need to devise a strategy to detect and apply punitive action against video spammers as soon as they pop up.

Our strategy involves compiling a list of 'good' users and 'bad' users based on their history (how many times their videos have violated the terms of service, how many times they have been warned or (temporarily) banned for their actions, how many times they have been flagged by other users, whether they posted the same spam video anywhere, etc.). We will then find the common attributes within each of the 2 groups of users and use them to classify new users. In the discussion, several characteristics of a video spammer were suggested, e.g. no favorite videos, no friends, not having watched many videos, being a new user, etc.
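
A very rough sketch of this strategy is shown below. Everything in it is hypothetical: the attribute names, the sample users and the scoring rule are invented for illustration, and a real system would need far more data and a proper classifier.

```python
ATTRIBUTES = ("no_favorites", "no_friends", "few_views", "new_account")

def attribute_rates(users):
    """Fraction of `users` showing each attribute (users are dicts of booleans)."""
    return {a: sum(u[a] for u in users) / len(users) for a in ATTRIBUTES}

def spam_score(user, good_rates, bad_rates):
    """Count the user's attributes that are more common among known spammers."""
    return sum(1 for a in ATTRIBUTES if user[a] and bad_rates[a] > good_rates[a])

# Invented history: two known 'good' users and two known 'bad' users.
good = [
    {"no_favorites": False, "no_friends": False, "few_views": False, "new_account": True},
    {"no_favorites": False, "no_friends": True, "few_views": False, "new_account": False},
]
bad = [
    {"no_favorites": True, "no_friends": True, "few_views": True, "new_account": True},
    {"no_favorites": True, "no_friends": True, "few_views": False, "new_account": True},
]
good_rates, bad_rates = attribute_rates(good), attribute_rates(bad)

newcomer = {"no_favorites": True, "no_friends": True, "few_views": True, "new_account": True}
regular = {"no_favorites": False, "no_friends": False, "few_views": False, "new_account": False}

print(spam_score(newcomer, good_rates, bad_rates))
print(spam_score(regular, good_rates, bad_rates))
```

A high score only marks a user as worth a closer look. As the 'good' sample shows, being new or having no friends does not by itself make someone a spammer.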

In conclusion, although we didn't get to know exactly how YouTube works behind the scenes, the seminar has educated us about the difficulties YouTube encountered in the past and is facing now, and how they have resolved, or could possibly resolve, these problems.

by Hong Dai Thanh and Dang Dung Ha

*We apologize for putting this up more than 2 weeks late.

Monday, September 14, 2009

round up of virus session.

Before I attended the session on viruses, I thought Windows Vista was stupid for always asking me for permission. Then Dr Liang showed us that this is necessary to prevent my computer from being used by a third party.

I should say it was like watching a scary movie, except that we were told we could be the next victims. Firewalls and anti-virus software only stop hackers to some extent. Hackers have many ways to attack a system, and it takes them only a few seconds to take control of a computer. The conclusion is that we need to know more about computers so that we can detect suspicious behavior.

Hackers can attack directly; however, this can be stopped most of the time by a firewall.

Hackers can also exploit security holes in a computer, for example by creating files with a hidden suffix (file extension) and crafting the virus so that it uses the hole to take control of the computer.

Hackers can make use of data that spills over in memory (a buffer overflow).

overflow of data ----> data spills over into a neighbouring location in memory ----> that location happens to be the execution code region ----> the original execution code is overwritten by the hacker's code.
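
The chain above can be imitated in a harmless toy model. Python itself does not allow this kind of overflow, so the sketch below only simulates memory with a `bytearray`: 8 bytes of buffer sit directly in front of 4 bytes that stand in for the code to run next. The layout and the function names are our own assumptions for illustration.

```python
BUF_START, BUF_SIZE = 0, 8

def new_memory():
    """8 bytes of buffer followed by 4 bytes standing in for executable code."""
    mem = bytearray(12)
    mem[8:12] = b"SAFE"    # the legitimate "next instruction"
    return mem

def unchecked_copy(mem, data):
    """Copy without a length check, as a careless low-level copy would."""
    for i, byte in enumerate(data):
        mem[BUF_START + i] = byte    # happily writes past BUF_SIZE

def checked_copy(mem, data):
    """Copy at most BUF_SIZE bytes, so neighbouring memory is untouched."""
    for i, byte in enumerate(data[:BUF_SIZE]):
        mem[BUF_START + i] = byte

payload = b"AAAAAAAAEVIL"    # 8 filler bytes + 4 bytes from the attacker

victim = new_memory()
unchecked_copy(victim, payload)
after_attack = bytes(victim[8:12])

protected = new_memory()
checked_copy(protected, payload)
after_check = bytes(protected[8:12])

print(after_attack, after_check)
```

With the unchecked copy, the "code" region ends up holding the attacker's bytes; with the length check it still holds the original ones.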

Hackers can attack a server by using a large zombie network to send it so many requests that the server crashes. This method is called a DDoS (distributed denial-of-service) attack.

In a world where e-business is becoming the trend, hackers have an interest in attacking other computers so as to get important information such as account names and passwords. That makes it more important for users to have a safe, secure system. However, we, as computing students, need to know more about the computer; that knowledge is the safest firewall in the world.

At the national level, strategic information stored in computers can be stolen by hackers from other countries, and cyberspace has become a new battlefield. Recently, S. Korea announced that they are going to form a strategic information command centre.

I hope that even if we have the power to attack a computer, we will use it in a responsible way. Besides, the legal system is catching up with the technology, so being a hacker puts your future at risk.

written by Shiyu and Jiangkuo

Monday, September 7, 2009

Youtube*

The seminar on 4 September 2009 (Friday) was interactive and useful. We learnt a lot from Dr Ooi Wei Tsang about how YouTube manages the videos uploaded by users worldwide. In that seminar, he discussed video compression, video distribution, video thumbnails, video fingerprinting, video spammers and video recommendation. Below are a few highlights of what was discussed…

What is video compression?
As many know, videos take up a lot of space, since 1 pixel = 1 byte for red + 1 byte for blue + 1 byte for green, and thus they have to be compressed before they are put up on the web. In the seminar, our speaker mentioned that one way of compressing video is lossy compression. However, this type of compression causes the compressed video file to have less data, and this may affect the quality of the video. Another way is to make use of temporal redundancy, where we can reduce data by looking for pixels in two video frames that have the same values in the same location.

How are videos distributed?
Videos uploaded to YouTube are stored in a system of computers placed at various locations around the world, so that users can download videos from a server near their location instead of from the central server, which avoids a bottleneck near the central server. This system of computers is called a content delivery network (CDN), as mentioned by our speaker. He also mentioned that Limelight provides a content delivery network service (www.limelightnetworks.com).

What is video fingerprinting?
Video fingerprinting is a technique that helps to prevent users from uploading videos that infringe copyright laws. As the word ‘fingerprinting’ suggests, this technique helps to check whether the video uploaded by the user has unique images by comparing it with a library of materials that are marked as copyrighted. Video fingerprinting analysis may be based on key frame analysis and on the colour and motion changes during a video sequence.

Video thumbnails and Video recommendation
The thumbnail of a video is what users see when searching or browsing YouTube videos. These thumbnails provide some screen shots of the content of the video so that users can decide which video they want. Video recommendation is done by picking out the videos that are most commonly watched by people. Recommendations can also be made by asking users to rank the videos that they have watched. YouTube also has the active sharing feature, which allows viewers to signal to other viewers that they are watching the same video at a particular time.

In conclusion, the seminar on how YouTube works was useful, as it provided me with lots of knowledge that I never had and it also allowed me to understand the ‘huge’ amount of work behind the scenes of YouTube (from the uploading of videos to distribution and preventing the infringement of copyright laws).

By:Geck Keat and Byron

Youtube*

YouTube is a state-of-the-art entertainment provider, the host of billions of videos produced by millions of people of all backgrounds and skin colours, professionals and amateurs alike. The number of videos handled and processed by the company runs into the millions each day. It is really amazing to see how they handle these problems so well while still earning money despite not charging people for viewing the videos.

YouTube deals with so many videos each day. So, how do they actually make sure that these videos are legal? How do they manage to store so many video files on their systems? The files we are talking about here are EACH in the hundreds of megabytes. And they receive over 100 such files a minute! Just imagine how much space they need just to store all these files, and the super high cost they have to bear for them! And again, how can they make sure that the videos that users upload are legal? A copyright lawsuit would definitely cost them millions of dollars!

Thankfully, the smart people at YouTube check and convert each and every file from its original format to the default format, FLV. The idea of FLV is that, although the quality is not very good due to the loss of image information, a video takes up only a very small file size while keeping reasonable quality. They also check each file using frame checks to make sure the uploaded files are legal. Apart from frame checks, they also do colour differentiation to make sure the files are genuine and original.

Even business-wise, they are clever to collaborate with various network providers so that they do not need to store all the files within their own systems. Apart from that, the collaboration also makes loading of the videos faster! The ingenious way of earning via advertisements also generates a large amount of profit for them.

They earn without loss, yet provide good entertainment to society.
Kudos to YouTube!

By: Byron and Geck Keat

* Please note that we are posting because we did not see anyone posting any round-up on this. If you are the rightful writer, please feel free to erase our names and put your own.

Thursday, September 3, 2009

Seminar roundup: A First Look at Second Life

On 28 August 2009, Mr John Yap gave us a talk on Second Life and showed us how NUS Second Life works.
Second Life is a virtual world developed by Linden Lab. In this virtual world, users can interact with each other through avatars. It is a virtual platform for users to socialize and participate in group activities. Users can also create and trade virtual property and services, travel to many parts of the world within seconds, and much more. Second Life is restricted to users aged 18 and above. However, there is also Teen Second Life for teenagers between 13 and 17 years old.
Second Life requires the use of avatars. An avatar is a character that the user chooses to represent himself/herself in Second Life. As the name Second Life suggests, this virtual platform allows users to have another life (which is virtual) besides their usual daily life. Users can choose any appearance they wish to represent themselves.
Furthermore, users can socialize and make new friends, just like in real life. It is as though users can live a new life in Second Life. More often than not, users tend to live a completely different life from their real one. Thus, users can get to experience a lifestyle that they want even though it is impossible in real life.
NUS Second Life is another version of Second Life that is specially created to cater to the needs of NUS students. It is a platform where students from NUS can chat with their peers, tour around the NUS campus (such as University Hall), have online discussions for projects, etc.
Other than socializing, students may also be required to attend lessons in Second Life if their teachers ask them to do so. This will in turn increase the interactivity of lessons.

Written by Geck Keat and Li Hua.
