Showing posts with label challenge. Show all posts

Friday, September 23, 2016

It's Friday, it's Hamming's time: Call for deep learning research problems from your "problem surplus" stack


Francois Chollet, the creator of Keras, tweeted the following interesting proposition:

Are you a researcher? Then you probably have a "problem surplus": a list of interesting and important research problems that you don't have time to work on yourself. What if you could outsource some of these problems to distributed teams of motivated students and independent researchers looking to build experience in deep learning? You just have to submit the description of your problem, some pointers as to how to get started, and provide lightweight supervision along the way (occasionally answer questions, provide feedback, suggest experiments to try...).
What you get out of this:
  • Innovative solutions to research problems that matter to you.
  • Full credits for the value you provide along the research process.
  • New contacts among bright people outside of your usual circles.
  • A fun experience.
We are looking for both deep learning research problems, and problems from other fields that could be solved using deep learning.
Note that the information you submit here may be made public (except for your contact information). We will create a website listing the problems submitted, where people will be able to self-organize into teams dedicated to specific problems. You will be in contact with the people working on your problem via a mailing list. The research process will take place in the open, with communications being publicly available and code being released on GitHub.

 Here are some problems:

      Enhanced NAVCAM image of Comet 67P/C-G taken on 18 September 2016, 12.1 km from the nucleus centre. The scale is 1.0 m/pixel and the image measures about 1.1 km across. Credits: ESA/Rosetta/NAVCAM – CC BY-SA IGO 3.0
       
      Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
      Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

      Thursday, June 09, 2016

      A brief Review of the ChaLearn AutoML Challenge: Any-time Any-dataset Learning without Human Intervention

      The AutoML challenge has ended and they have a paper on the results.

      A brief Review of the ChaLearn AutoML Challenge: Any-time Any-dataset Learning without Human Intervention by Isabelle Guyon, Imad Chaabane, Hugo Jair Escalante, Sergio Escalera, Damir Jajetic, James Robert Lloyd, Nuria Macia, Bisakha Ray, Lukasz Romaszko, Michele Sebag, Alexander Statnikov, Sébastien Treguer, Evelyne Viegas
      The ChaLearn AutoML Challenge team conducted a large scale evaluation of fully automatic, black-box learning machines for feature-based classification and regression problems. The test bed was composed of 30 data sets from a wide variety of application domains and ranging across different types of complexity. Over five rounds, participants succeeded in delivering AutoML software capable of being trained and tested without human intervention. Although improvements can still be made to close the gap between human-tweaked and AutoML models, this challenge has been a leap forward in the field and its platform will remain available for post-challenge submissions at http://codalab.org/AutoML.
      Previously:

      Friday, May 13, 2016

      Follow-up to 'Making Hyperspectral Imaging Mainstream'

Following up on this morning's hyperspectral entry, here is some feedback on my proposal about Making Hyperspectral Imaging Mainstream.
       

Do you remember it? No? Well, go read it, I'll wait.... Ximea, the maker of hyperspectral cameras, is still pondering the issue, but I got a few very good interactions out of that idea. Here they are:

      From the blog comment section:
      Harrison Knoll said...
      This is a great idea! We here at Aerial Agriculture have been collecting hyperspectral data and will be following your progress. Let us know if there is anything you need! ~Harrison
       
      Someone from Movidius  said...
      Great idea - Movidius would be very supportive
      Alex St. John said...
      Agreed, hyperspectral space is the place for next-level analysis!

      Also from Aerial Agriculture here, and am interested to follow up on this kind of project and continue building.

      Alex

      Indir Jaganjac 

Igor, see these hyperspectral images of natural scenes at the Manchester University site: http://personalpages.manchester.ac.uk/staff/david.foster/Hyperspectral_images_of_natural_scenes_04.html. Scenes were illuminated by direct sunlight in a clear or almost clear sky. Estimated reflectance spectra (effective spectral reflectances) at each pixel of each scene image can be downloaded (1017x1338x33 Matlab array). The hyperspectral imaging system used to acquire the scene reflectances was based on a low-noise Peltier-cooled digital camera providing a spatial resolution of 1344x1024 pixels (Hamamatsu, model C4742-95-12ER) with a fast tunable liquid-crystal filter.

      Kyle Forbes 
      Experienced Software and Data Leader
That's why I started www.agrolytic.com, leveraging machine learning with hyperspectral and other spatial data to address information challenges in agriculture.

      Very interesting !

       
      All hyperspectral related blog entries are under the hyperspectral tag.

      In other news, here is: Image-level Classification in Hyperspectral Images using Feature Descriptors, with Application to Face Recognition  by Vivek Sharma, Luc Van Gool

      In this paper, we proposed a novel pipeline for image-level classification in the hyperspectral images. By doing this, we show that the discriminative spectral information at image-level features lead to significantly improved performance in a face recognition task. We also explored the potential of traditional feature descriptors in the hyperspectral images. From our evaluations, we observe that SIFT features outperform the state-of-the-art hyperspectral face recognition methods, and also the other descriptors. With the increasing deployment of hyperspectral sensors in a multitude of applications, we believe that our approach can effectively exploit the spectral information in hyperspectral images, thus beneficial to more accurate classification.
       
       

      Friday, December 18, 2015

      Hamming's Time: Making Hyperspectral Imaging Mainstream

      Friday afternoon is Hamming's time. Today I decided to compete in the Best Camera Application contest of XIMEA, a maker of small hyperspectral cameras. Here is my entry:


      Challenging task: Make hyperspectral imaging mainstream

      Idea: Create a large database of hyperspectral imagery for use in Machine/Deep Learning Competitions



      Background

Machine Learning is the field concerned with creating, training and using algorithms dedicated to making sense of data. These algorithms take advantage of training data (images, videos) to improve at tasks such as detection and classification. In recent years, we have witnessed spectacular growth in this field thanks to the joint availability of large datasets originating from the internet and the attendant curating/labeling efforts for those images and videos.

Numerous available labeled datasets such as CIFAR [1], Imagenet [2], etc. routinely permit algorithms of increased complexity to be developed and compete in state-of-the-art classification contests. For instance, the rise of deep learning algorithms came from breaking all the state-of-the-art classification results in the ILSVRC-2012 competition, achieving "a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry" [3]. More recent examples of these heated competition results were shown at the NIPS conference last week, where teams at Microsoft Research produced breakthroughs in classification with astounding 152-layer neural networks [4]. This intense competition between highly capable teams at universities and large internet companies is only possible because a large amount of training data is being made available.

Image or even video processing for hyperspectral imagery cannot follow the development path that image processing took over the past 40 years. That development was performed at considerable expense by companies and governments alike and eventually yielded standards such as JPEG, GIF, JPEG 2000, MPEG, etc. Because such funding is no longer available, we need to find other ways of improving and working with new imaging modalities.
Technically, since hyperspectral imagery is still a niche market, most analysis performed in this field runs the risk of being treated as an outgrowth of normal imagery: substandard tools such as JPEG or labor-intensive computer vision tools are being used to classify and use this imagery without much thought given to the additional structure of the spectrum information. More sophisticated tools such as advanced matrix factorizations (NMF, PCA, Sparse PCA, dictionary learning, ...) in turn focus on the spectral information but seldom use the spatial information. Both approaches suffer from not investigating more fully the inherent robust structure of this imagery.
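To make the spectral-only point concrete, here is a toy multiplicative-update NMF run on a synthetic hyperspectral cube; the cube is flattened to a pixels-by-bands matrix, which is exactly where the spatial arrangement gets discarded. All dimensions and the number of endmembers are illustrative assumptions, not tied to any particular dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, B, K = 32, 32, 33, 4  # spatial dims, spectral bands, endmembers

# Synthetic cube built from K ground-truth endmember spectra
true_S = rng.random((K, B))          # endmember spectra
true_A = rng.random((H * W, K))      # per-pixel abundances
X = true_A @ true_S                  # flattened cube: (pixels x bands)

# Multiplicative-update NMF: X ~ A @ S with A, S >= 0.
# Note: the rows of X are treated as an unordered bag of spectra,
# so the spatial structure of the cube plays no role at all.
A = rng.random((H * W, K)) + 0.1
S = rng.random((K, B)) + 0.1
for _ in range(500):
    A *= (X @ S.T) / (A @ S @ S.T + 1e-9)
    S *= (A.T @ X) / (A.T @ A @ S + 1e-9)

err = np.linalg.norm(X - A @ S) / np.linalg.norm(X)
print(A.shape, S.shape, round(err, 4))
```

On this exactly low-rank toy data the relative reconstruction error drops close to zero; on real cubes one would also want a spatial regularizer, which is the gap the paragraph above points at.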

For hyperspectral imagery to become mainstream, algorithms for its compression and day-to-day use have to take advantage of the currently very active and highly competitive development of Machine Learning algorithms. In short, creating large and rich hyperspectral imagery datasets beyond what is currently available [5-8] is central for this technology to grow out of its niche markets and become central in our everyday lives.



      The proposal

In order to make hyperspectral imagery mainstream, I propose to use a XIMEA camera to shoot imagery and video of different objects and locations, and to label these datasets.

The datasets will then be made available on the internet for use by parties interested in running classification competitions based on them (Kaggle, academic competitions, ...).

As a co-organizer of the meetup, I also intend to enlist some of the folks in the Paris Machine Learning meetup group (with close to 3000 members, it is one of the largest Machine Learning meetups in the world [9]) to help enrich this dataset.

The dataset should be available from servers probably colocated at a university or some non-profit organization (to be identified). A report presenting the dataset should eventually be academically citable.



      References
      [3] ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
      [8] Parraga CA, Brelstaff G, Troscianko T, Moorhead IR, Journal of the Optical Society of America 15 (3): 563-569, 1998 or G. Brelstaff, A. Párraga, T. Troscianko and D. Carr, SPIE. Vol. 2587. Geog. Inf. Sys. Photogram. and Geolog./Geophys. Remote Sensing, 150-159, 1995,



      Thursday, December 10, 2015

      Challenge: Computational Imaging for VLBI Image Reconstruction

      Here is a reconstruction endeavor that could eventually help the Event Horizon Telescope: Computational Imaging for VLBI Image Reconstruction by Katherine L. Bouman, Michael D. Johnson, Daniel Zoran, Vincent L. Fish, Sheperd S. Doeleman, William T. Freeman

      Very long baseline interferometry (VLBI) is a technique for imaging celestial radio emissions by simultaneously observing a source from telescopes distributed across Earth. The challenges in reconstructing images from fine angular resolution VLBI data are immense. The data is extremely sparse and noisy, thus requiring statistical image models such as those designed in the computer vision community. In this paper we present a novel Bayesian approach for VLBI image reconstruction. While other methods require careful tuning and parameter selection for different types of images, our method is robust and produces good results under different settings such as low SNR or extended emissions. The success of our method is demonstrated on realistic synthetic experiments as well as publicly available real data. We present this problem in a way that is accessible to members of the computer vision community, and provide a dataset website (vlbiimaging.csail.mit.edu) to allow for controlled comparisons across algorithms. This dataset can foster development of new methods by making VLBI easily approachable to computer vision researchers.
      At vlbiimaging.csail.mit.edu, they have a Dataset Designed to Train and Test Very Long Baseline Interferometry Image Reconstruction Algorithms. From the first page:

      Welcome to the VLBI Reconstruction Dataset!

      The goal of this website is to provide a testbed for developing new VLBI reconstruction algorithms. By supplying a large set of easy to understand training and testing data, we hope to make the problem more accessible to those less familiar with the VLBI field. Specifically, this website contains a:

      What is VLBI Imaging?

      Imaging distant celestial sources with high resolving power requires telescopes with prohibitively large diameters due to the inverse relationship between angular resolution and telescope diameter. However, by simultaneously collecting data from an array of telescopes located around the Earth, it is possible to emulate samples from a single telescope with a diameter equal to the maximum distance between telescopes in the array. Using multiple telescopes in this manner is referred to as very long baseline interferometry (VLBI).
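To get a feel for the numbers, here is a small back-of-the-envelope calculation of the diffraction limit θ ≈ λ/D for a single dish versus an Earth-sized baseline. The 1.3 mm wavelength and 30 m dish are illustrative values in the mm/sub-mm regime mentioned below, not figures from the paper.

```python
import math

# Diffraction limit: theta ~ lambda / D (radians)
wavelength = 1.3e-3         # 1.3 mm observing wavelength (illustrative)
single_dish = 30.0          # a large single-dish radio telescope, in meters
earth_baseline = 1.274e7    # Earth's diameter in meters: max VLBI baseline

rad_to_uas = 180 / math.pi * 3600 * 1e6  # radians -> microarcseconds

for D in (single_dish, earth_baseline):
    theta = wavelength / D
    print(f"D = {D:.3g} m -> {theta * rad_to_uas:.3g} microarcsec")
```

The Earth-sized baseline buys roughly six orders of magnitude in angular resolution over the single dish, which is why VLBI is the only game in town for event-horizon-scale imaging.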
Reconstructing an image from VLBI measurements is an ill-posed problem, and as such there are infinitely many possible images that explain the data. The challenge is to find an explanation that respects prior assumptions while still satisfying the observed data. The goal of this website is to aid in the process of developing these algorithms as well as to evaluate their performance.

The ongoing international effort to create an Event Horizon Telescope capable of imaging the environment around a black hole’s event horizon calls for the use of VLBI reconstruction algorithms. The angular resolution necessary for these measurements requires overcoming many challenges, all of which make image reconstruction more difficult. For instance, at the mm/sub-mm wavelengths being observed, rapidly varying inhomogeneities in the atmosphere introduce additional measurement errors. Robust algorithms that are able to reconstruct images in this fine angular resolution regime are essential for scientific progress.

      Sunday, July 12, 2015

      Sunday Morning Insight: The Challenges of Reddit's Sparse Admins/Mods Graphs and Sudden Phase Transitions


A large part of the internet industry focuses on doing things faster by using the prior information that everything is on a graph (this previous entry uses the graph to devise a reconstruction solver; companies like Google, Facebook or Dato also use this as a central assumption), but very little care is given to finding out how the network behaves over time. Sure enough, search engines like Google have developed ways to handle the dynamics of the web over time, and social networks like Facebook or LinkedIn do in fact try their best to design the right recommendation engines to avoid link/friend/job announcement fatigue, but the reality is that predicting the dynamics of the graph is most of the time left as an empirical exercise.
      Sparse Graphs and Sudden Phase Transitions
It happened before with F##@Company, Digg, Myspace or even earlier (see Clay Shirky's 2003 A group is its own worst enemy, h/t Hugh McLeod); the ongoing Reddit implosion (explained here and here) is just one of those seemingly sudden phase transitions. To put these stories in the context of what we read here on Nuit Blanche, the Reddit moderators belong to a sparse graph within the much denser population of users. That sparse graph also interacts, very modestly it seems, with the graph of paid administrators of the site. Given this tidbit of context, I wonder if some of the dynamics on Reddit could not be better foreseen using some of the user-mods-admins data and the statistical physics of networks as explained to us by Lenka Zdeborova at the Paris Machine Learning meetup #8 Season 1 (slides: How hard is it to find a needle in a haystack?).

       
It is one thing to figure out that a graph is sparse (duh! the number of moderators is low); it is yet another to figure out the dynamics between the sparse set of unpaid volunteer moderators, the inner group of paid admins at Reddit, and the rest of the much denser community. The fate of Reddit hinges on its ability to mend fences between the first two communities while at the same time pleasing the third. The first question one should ask is whether Reddit has the ability to build links and trust so that it gets away from a perilous metastable region. The second question is algorithmic, as shown by Lenka and colleagues: if the wrong spectral operator is used to figure out the dynamics of these graphs, you might be over/underestimating by a large margin the effort it takes to get away from this metastable region.
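For readers who want to experiment, here is a minimal numpy sketch of one such spectral operator, the non-backtracking matrix, which Lenka and colleagues have shown behaves much better than the adjacency matrix for sparse community detection. The graph below is an arbitrary toy illustration, not Reddit data.

```python
import numpy as np

# Tiny undirected graph as an edge list (a 5-cycle with one chord)
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]

# Each undirected edge becomes two directed "darts"
darts = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]

# Non-backtracking matrix B: B[(u->v), (v->w)] = 1 whenever w != u,
# i.e. a walk may continue from v anywhere except straight back to u.
m = len(darts)
B = np.zeros((m, m))
for i, (u, v) in enumerate(darts):
    for j, (v2, w) in enumerate(darts):
        if v2 == v and w != u:
            B[i, j] = 1.0

# The leading eigenvalues of B carry the detectability information
eigvals = np.sort(np.abs(np.linalg.eigvals(B)))[::-1]
print(B.shape, eigvals[:3].round(3))
```

Each row (u→v) has exactly deg(v) − 1 ones, encoding the "no immediate U-turn" constraint; on large sparse graphs this is what keeps the informative eigenvalues from being swamped by hubs.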


It is one thing to build trust between communities in some holistic fashion (aka through PR and some information control); it is yet another to quantify the fragility of these networks and the different ways to make them robust over time.
      Recent work by Lenka and colleagues:

       
       

      Thursday, April 23, 2015

      AutoML Challenge: Python Notebooks for Round 1 and more...




In the Sunday Morning Insight entry entitled The Hardest Challenges We Should be Unwilling to Postpone, I mentioned a challenge set up by Isabelle Guyon entitled the AutoML challenge (http://codalab.org/AutoML; her presentation is here). In short, the idea is to have a Kaggle-like challenge that features several datasets of increasing difficulty and see how algorithm entries fare on these different datasets. Deep down, the algorithm needs to pay attention to its own running time and have a nice way of automatically selecting relevant features.
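To illustrate the any-time constraint, here is a toy sketch of a time-budgeted model search in pure numpy: try candidate models in turn, stop whenever the budget runs out, and keep the best validation score so far. The budget, the two tiny classifiers and the synthetic task are all made up for the example and bear no relation to the actual challenge code.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
# Synthetic two-Gaussian binary classification task
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(1.5, 1, (100, 5))])
y = np.repeat([0, 1], 100)
idx = rng.permutation(200)
Xtr, ytr, Xva, yva = X[idx[:150]], y[idx[:150]], X[idx[150:]], y[idx[150:]]

def nearest_centroid(Xtr, ytr, Xq):
    c = np.stack([Xtr[ytr == k].mean(axis=0) for k in (0, 1)])
    return ((Xq[:, None, :] - c[None]) ** 2).sum(-1).argmin(1)

def one_nn(Xtr, ytr, Xq):
    return ytr[((Xq[:, None, :] - Xtr[None]) ** 2).sum(-1).argmin(1)]

budget, best = 2.0, (None, -1.0)   # seconds, (model name, val accuracy)
start = time.monotonic()
for name, fit in [("centroid", nearest_centroid), ("1-nn", one_nn)]:
    if time.monotonic() - start > budget:
        break                       # any-time behavior: stop when out of budget
    acc = (fit(Xtr, ytr, Xva) == yva).mean()
    if acc > best[1]:
        best = (name, acc)
print(best[0], round(best[1], 2))
```

A real AutoML entry would of course search over far richer model and feature-selection spaces, but the "always have a usable answer when the clock stops" structure is the same.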

With Franck, we decided to use the mighty power of the large membership of the Paris Machine Learning meetup (Top 5 in the world) to help set up a day-long hackathon so that local teams could participate in the challenge. Round 1 of the challenge is over and we are currently in the Tweakathon1 stage, where you can submit code that will eventually be run automatically on May 15 for AutoML2. From here:

      Tweakathon1 
      Continue practicing on the same data (the phase 1 data are now available for download from the 'Get Data' page). In preparation for phase 2, submit code capable of producing predictions on both VALIDATION AND TEST DATA. The leaderboard shows scores on phase 1 validation data only.

      AutoML2


      Start: May 15, 2015, 11:59 p.m.

      Description: INTERMEDIATE phase on multiclass classification problems. Blind test of the code on NEW DATA: There is NO NEW SUBMISSION. The last code submitted in phase 1 is run automatically on the new phase 2 datasets. [+] Prize winning phase.


      Tweakathon2  
      Start: May 16, 2015, 11:59 p.m.

      Description: Continue practicing on the same data (the data are now available for download from the 'Get Data' page). In preparation for phase 3, submit code capable of producing predictions on both VALIDATION AND TEST DATA. The leaderboard shows scores on phase 2 validation data only. 


Here are some of the presentations made during the hackathon and some of the attendant Python notebooks released for Tweakathon1:

The page for the hackathon is here. A big thank you to Pierre Roussel for hosting us at ESPCI ParisTech and to the coaches.
      Other links:


       

      Tuesday, February 12, 2013

      Ben's Chicken Challenge

We first had a magnificent dataset in Leonardo's Challenge, featuring a 65-million-year-old dinosaur. Well, today we have something that tastes closer to chicken: a chicken. Ben Krasnow has a hobby of building things; in short, he is a Maker. Ben recently started building an X-ray system in his garage. To make things more interesting, he used an Arduino to command a rotating plate and an intervalometer in order to obtain a CT system of sorts (see DIY X-ray CT scanner controlled by an Arduino). The camera taking photos of the phosphorous medium is a DSLR and shoots at an angle, as shown below:


Here is the full video explaining what he did:



Ben made available the 45 raw images from this experiment; they are here. One could do a lot of things with them, such as:
      • CT dictionary learning for synthesis or analysis (and by the same token discover the operator that emulates the Radon transform)
      • an uncalibrated SIFT or FREAK based 3-D reconstruction
      • ....
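As a starting point for the first bullet, here is a small numpy sketch of the parallel-beam projection operator (a discrete Radon transform of sorts) that such a learned dictionary would emulate. The phantom, image size and angles are illustrative, not Ben's actual geometry.

```python
import numpy as np

def project(img, theta):
    """One parallel-beam projection: rotate the sampling grid by theta
    (nearest-neighbor), then sum along columns."""
    n = img.shape[0]
    c = (n - 1) / 2
    ys, xs = np.mgrid[0:n, 0:n]
    # rotate sample coordinates about the image center
    xr = np.cos(theta) * (xs - c) - np.sin(theta) * (ys - c) + c
    yr = np.sin(theta) * (xs - c) + np.cos(theta) * (ys - c) + c
    xi, yi = np.rint(xr).astype(int), np.rint(yr).astype(int)
    ok = (0 <= xi) & (xi < n) & (0 <= yi) & (yi < n)
    rot = np.where(ok, img[np.clip(yi, 0, n - 1), np.clip(xi, 0, n - 1)], 0.0)
    return rot.sum(axis=0)  # one row of the sinogram

# Simple square phantom standing in for the chicken
img = np.zeros((64, 64))
img[24:40, 24:40] = 1.0
angles = np.linspace(0, np.pi, 45, endpoint=False)  # 45 views, like Ben's 45 frames
sinogram = np.stack([project(img, t) for t in angles])
print(sinogram.shape)
```

Learning a dictionary that maps images to such sinograms (and back) is one way to "discover the operator that emulates the Radon transform" without calibrating the rig.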

      If you make something awesome out of this dataset, I'll feature it on Nuit Blanche. Ben's YouTube channel is here, his blog is here. Other datasets and challenges are listed here.



      Thursday, August 09, 2012

      The Curiosity Super-Resolution Challenge


      This is the low resolution panorama picture of the Gale Crater, the landing site of Curiosity. Here is an interesting challenge for the week-end. From the BBC interview of Mike Malin:

      "The individual frames are only 144 by 144 pixels. There are 130 of them in there. It took us about an hour and six minutes to take the mosaic.
      "For the full-resolution panorama, the data volume will be 64 times larger, [and] the resolution will be eight times better. But this was pretty enough and interesting enough that we thought it was worth sharing with you guys," he told BBC News.
Many of the raw images are high resolution, such as this one:



Here is the challenge: with a bandwidth of 31 MB/day, it is going to take a while to get a high-resolution panorama of the Gale Crater. Can one build a high-resolution panorama using some dictionary learning from the raw images to "super resolve"/inpaint the current low-resolution panorama?
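As a hint of how such a dictionary-learning approach might start, here is a toy numpy sketch that collects matched low-/high-resolution patch pairs, the raw material for learning a coupled dictionary. The image, downsampling factor and patch sizes are illustrative assumptions.

```python
import numpy as np

# Toy "high-res frame" and its 2x-downsampled counterpart (block averaging),
# standing in for a raw Curiosity frame and the low-res panorama tile
rng = np.random.default_rng(1)
hi = rng.random((64, 64))
lo = hi.reshape(32, 2, 32, 2).mean(axis=(1, 3))

def patches(img, size, step):
    """Extract non-overlapping size x size patches as flat vectors."""
    out = []
    for i in range(0, img.shape[0] - size + 1, step):
        for j in range(0, img.shape[1] - size + 1, step):
            out.append(img[i:i + size, j:j + size].ravel())
    return np.array(out)

lo_p = patches(lo, 4, 4)   # 4x4 low-res patches
hi_p = patches(hi, 8, 8)   # the co-located 8x8 high-res patches
print(lo_p.shape, hi_p.shape)
```

A coupled dictionary trained on such pairs lets you sparse-code each low-res panorama patch and synthesize its high-res counterpart from the paired atoms, which is the classic sparse-coding super-resolution recipe.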


      Image credit: NASA/JPL-Caltech/MSSS

      Wednesday, December 07, 2011

      Leonardo's Challenge

Forget Darpa's shredder challenge, this one has waited millions of years to get to us. You probably recall this entry on Leonardo, a fossilized mummy of a 65-million-year-old dinosaur (i.e. the internal organs and the skin have been fossilized!). If you don't remember that entry, please take a second to read it again. I have also added the attendant video:









      I have been talking to some of the folks involved in this video. In particular Tom Kaye and Art Anderson. Tom is the person who provided software that could enhance the shots while Art is the person behind the actual X-ray taking at Ellington Fields (the shots were so powerful they had to do them at night when nobody would be around the hangars where the X-ray machine performed).

These X-ray shots are very unique, and Art kindly provided a large suite of them for this challenge. Leonardo's challenge is pretty simple: can we reconstruct something beyond just playing with the contrast? Reconstruction may not mean 3D reconstruction; one could simply play with the contrast and SIFT points to assemble the different images together (as we don't have any reference on how these shots were taken)... it's your turn to be smart about how to use this pretty unique dataset... I'll feature the best efforts on the blog.

Here are some examples of the enhancement Tom did:

      Before

      After



      The Leonardo Challenge Dataset is here.

      Thanks Art for making it happen.
