In the past few years, we have seen a perfect storm mixing two growing phenomena: a data deluge stemming from access to cheap sensing and computational equipment, and a corresponding growth in scholarly publications. At the same time, the supply of reviewers has remained nearly constant. Aside from imposing reproducible research standards, open access to government-funded work is the only short- and long-term policy decision that can quickly enable a larger pool of quality reviewing capability. In the end, it enables a more robust scientific process.
Introduction
With the advent of cheap high-throughput equipment, we are seeing the emergence of what one could call "the curse of dimensionality", i.e. the ability to cheaply produce large amounts of data coupled with a still limited ability to make sense of them. This data deluge is, in turn, the primary reason behind the growth in the number of scholarly journals and journal articles over the past few years.
Unfortunately, the pool of potential reviewers has remained about the same and has not grown to the level needed to deal with these two factors. One can certainly wonder what impact this is having on how Science is performed (i.e. judged). In particular, the growth in the number of journals has resulted in a reliance on fewer potential high-quality reviewers per journal. More insidiously, the growth in data production and/or computational experiments has removed from most time-constrained reviewers the practical ability to perform thorough reviews.
Peer-Review
In light of this situation, the response by non-profit and commercial publishing entities has been to exacerbate the problem by opening the gates to newer journals and conference venues instead of developing innovative processes for the one function generally thought to be their value added to scientific discovery: the management of the peer-review process. Of considerable interest is publishers' (commercial or non-profit) currently lukewarm ability to deal proactively and fairly with retractions. In particular, there is no system in place for publications to address the fact that they may have referenced a recently retracted publication, for instance.
Under a regime of government-funded open access to publications, new or established players could change the way peer review is performed by enabling systems such as post-publication peer review. This is just one example, but innovation has to enter this market in order for the different stakeholders to continue producing high-quality work at the lowest price to the government.
Conclusions
The interest of the US Government in open access to government-funded work can be clearly delineated into the following reasons:
Open access enables non-time-constrained post-publication peer review by a larger pool of reviewers, thereby making for a more robust scientific discovery process.
Open access allows for innovation in the marketplace by enabling new (commercial or non-profit) actors in the peer-review process. These new players may create opportunities that are seldom explored in the current landscape.
Open access potentially reduces large costs to the government by improving its ability to deal effectively with past flawed work and the attendant retractions. Some of these retracted works may have had broad policy implications.
Open access reinforces United States leadership in matters related to Science and Technology development.
I haven't had time to look back at the setup of the Sudoku problem as an underdetermined system of linear equations, or at whether the solution has to be positive, but I wonder where this 16/17 result recently found through brute force fits in this diagram. Do you know?
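Since I keep pointing to that formulation without writing it down, here is a minimal sketch of the standard binary-indicator setup (my own illustration with assumed variable names, not any particular paper's code); it also answers the positivity question, since the unknown is binary and hence nonnegative:

% Sketch: x(i,j,k) = 1 iff cell (i,j) holds digit k, vectorized into a
% 9^3 = 729-long nonnegative vector z. Every Sudoku rule is a linear
% equality, so a puzzle with clues is the underdetermined system A*z = b.
n = 9; N = n^3;
pos = @(i,j,k) sub2ind([n n n], i, j, k);   % index of x(i,j,k) inside z
A = zeros(4*n^2, N); r = 0;
for i = 1:n, for j = 1:n                     % each cell holds exactly one digit
    r = r + 1; for k = 1:n, A(r, pos(i,j,k)) = 1; end
end, end
for i = 1:n, for k = 1:n                     % each digit appears once per row
    r = r + 1; for j = 1:n, A(r, pos(i,j,k)) = 1; end
end, end
for j = 1:n, for k = 1:n                     % ... once per column
    r = r + 1; for i = 1:n, A(r, pos(i,j,k)) = 1; end
end, end
for bi = 0:2, for bj = 0:2, for k = 1:n      % ... once per 3x3 box
    r = r + 1;
    for i = 3*bi+(1:3), for j = 3*bj+(1:3), A(r, pos(i,j,k)) = 1; end, end
end, end, end
b = ones(size(A,1), 1);
% Each clue "cell (i,j) holds digit k" appends the row e_{pos(i,j,k)}' with
% right-hand side 1. The solution is binary, hence automatically nonnegative,
% and one can then try  min ||z||_1  s.t.  A*z = b, z >= 0  as a relaxation.

In this language, the 16/17 question asks how few clue rows suffice to make the binary solution unique.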
What are the connections between the different problems of finding a defective coin or coins within a large set of known non-defective ones when there is:
It could be coins, precious rings or balls; they all ask the same question: how do you assemble some measurement process in order to recover an ugly duckling? More importantly, all these riddles are a handle: the story reminds you
that the act of finding a defective coin is not as important as how you find it,
Why is that important? Being familiar with a particular problem sometimes hinders your ability to see the big picture. You may have worked for years in coded aperture yet cannot make a clear connection between that and fertilizers and crops. You may know certain techniques with no firm theoretical grounding, yet when you hear about compressive sensing, you find that there is indeed a connection. The handle acts as a Rosetta stone: a breakthrough in one field immediately translates into many other fields. Empirical methods that work all the time cannot remain empirical forever. Should you wait for this theoretical underpinning to come to fruition before you can publish? No worries, the filter bubble of peer review makes that decision for you.
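To make the coin riddle concrete, here is a minimal non-adaptive group-testing sketch (a hypothetical setup of my own, not taken from any of the riddles): each "weighing" only reports whether a randomly chosen pool contains the defective coin, and an item is cleared as soon as it shows up in a negative pool.

% One defective coin among n; each test pools a random subset of coins and
% returns 1 iff the defective coin is in the pool (assumed test model).
n = 100; tests = 25;
defective = randi(n);                 % the unknown ugly duckling
M = rand(tests, n) < 0.5;             % pooling design: M(t,i)=1 if coin i is in pool t
results = M(:, defective);            % pool t is positive iff it contains the defective coin
candidates = true(1, n);
for t = 1:tests
    if ~results(t)
        candidates(M(t, :)) = false;  % everyone in a negative pool is cleared
    end
end
find(candidates)                      % with enough pools, only the defective coin survives

The compressive sensing connection is that the pooling design M plays the role of the measurement matrix and the defective coins form a sparse vector.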
I'll feature this example in the Compressed Sensing games page. One of the things that came out of the discussion with Gabriel is that there is a set of problems that cannot seem to be solved by using the L1/sparsity or the POCS solution. Unbeknownst to me, there are difficult Sudoku problems, and it looks like sparsity alone is not a good enough constraint to provide a solution for some of them, such as AI Escargot. I learn something every day. If you recall, Gabriel has already released two other examples of compressive sensing in his numerical tours:
Sunday, August 22, 2010 1:39 AM from Winnipeg
Re: the Kodak engineers produced the first digital camera...
Maybe a line from the technical report written at the time sums it up best: "The camera described in this report represents a first attempt demonstrating a photographic system which may, with improvements in technology, substantially impact the way pictures will be taken in the future." ... But in reality, we had no idea.
Nonsense. I had written them a letter (in the days before e-mail!) suggesting that they go one step further, mosaicing focal planes to select in-focus pixels and produce the first camera that could not take an out of focus picture. Got no reply. The other obvious problem was lack of self-confidence at Kodak: they made a digital back for a Nikon camera. Nikon is still around, and Kodak is just about dead. Moribund leadership for a once great company, I suspect.
I know a fellow at University of Manitoba who still has this Kodak unit.
Yours, -Dick Gordon, gordonr@cc.umanitoba.ca
In a different direction, somebody asked me to provide a hyperlink linking his/her name to his/her webpage. Please, if your name appears with no hyperlink or with the wrong one, let me know and I'll try to correct it asap. To give you an idea as to why it is important to be linked from Nuit Blanche, I usually receive spam requests asking me to include links in recent entries for free. I also get some spam from people trying to use the comment section to link back to their "content". Recently, however, something new happened: I received an email from a gentleman who proposed to pay a hundred dollars to plant a link in an old entry of this blog... this is interesting on so many levels. Without further ado, and using you, the readers, as filters, here is a set of meetings that are probably of interest to you: Pierre Vandergheynst let me know of a meeting in Marseille, mostly in French, on signal processing in biology. It is here. Mark Plumbley also let me know of the INSPIRE 2010 Conference in London.
INSPIRE 2010 Conference on information representation and estimation
Date, Time & Location
University College London
Monday 06/09/2010 - Wednesday 08/09/2010
09:00-18:00
Description
Mathematical methods for signal processing and in general for data representation and inference are growing more and more sophisticated. Successful applications of such methods range from medical imaging to security. Developments in mathematics and signal processing are however often divergent. Therefore, the main aim of this conference is to bring together signal processing researchers with statisticians and mathematicians working on problems related to data modelling and estimation, to encourage the exchange of ideas as well as to discuss theoretical underpinning of the methods and domains of application.
The INSPIRE 2010 conference will be held at the Anatomy JZ Young LT at University College London from September 6 till September 8, 2010.
Please visit the conference website to register: http://www.network-inspire.org/events/view/3
The Wired article on Compressed Sensing has started a series of ripples on the interweb that have reached the mighty Slashdot. While I wanted to address this issue a little later, I think it is important to point out that something in the Wired article has been missed by many readers.
What is described in the Wired article is mostly equivalent to the diverse traditional post-processing steps involved in what is called inpainting. There is a large amount of work in that area, but it is just post-processing. In other words, if your hardware has not been able to pick up the details in the first place, then it really is nearly impossible to get the kind of superresolution commonly found in CSI and other TV shows (from the video, it looks like the problem started with Blade Runner).
The MRI example of the Wired article uses random "pixels" in the Fourier domain: these pixels are not in the real domain as the Obama picture seems to indicate. This distinction is fundamental. For compressed sensing to work, the scene of interest has to be sampled in some incoherent fashion, and only specific hardware can do that. Your average digicam is not incoherent (or not incoherent enough), as much money has been spent on lenses to make sure that a point in the real world is translated into a point in the focal plane. In other words, if the scene was acquired coherently (this is why your hard-earned cash pays for a spiffy DSLR or a high-resolution webcam in the first place), then there is very little one can do in terms of inpainting or superresolution.
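To see the distinction in code, here is a minimal sketch (my own toy example, not the Wired/MRI pipeline): the measurements are a random subset of Fourier coefficients of a spike train, and a plain iterative soft-thresholding loop recovers it; no random pixel removal is involved.

% Toy example: recover a spike train from m random Fourier-domain samples.
N = 256; s = 8; m = 64;
x0 = zeros(N,1); p = randperm(N); x0(p(1:s)) = randn(s,1);
F = exp(-2i*pi*(0:N-1)'*(0:N-1)/N)/sqrt(N);   % unitary DFT matrix
rows = randperm(N); A = F(rows(1:m), :);       % keep m random Fourier "pixels"
y = A*x0;                                      % what the (idealized) scanner records
% Iterative soft thresholding for min 0.5*||A*x-y||^2 + lambda*||x||_1
lambda = 1e-2; x = zeros(N,1);
soft = @(z,t) (z./max(abs(z),eps)).*max(abs(z)-t, 0);
for it = 1:500
    x = soft(x + A'*(y - A*x), lambda);        % step size 1 is safe: ||A|| <= 1
end
norm(real(x) - x0)/norm(x0)                     % small when m is large enough

A digicam, by contrast, measures the pixel domain directly, which is coherent with the bases in which natural images are sparse; dropping pixels then leaves you with inpainting, not compressed sensing.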
If, on the other hand, the lenses of your digicam were badly broken, then there would be a chance to obtain superresolution or inpainting, and this is why I referred to compressed sensing as a potential CSI technology in a previous entry (Going Hollywood with Compressive Sensing). In that entry, I mentioned the fantastic random lens imager. Here is what one can see with 90 percent of the "pixels" gone; the 10% of the "pixels" left contain enough information to allow the reconstruction of a sensible image (see the reconstructed picture below in the right-hand corner):
What would a typical CSI scenario have to do to incorporate compressed sensing and be scientifically sound?
Well, for one, one could imagine a crime scene where a camera with badly damaged lenses has been found along with photos taken by that camera. The next step would be for the CSI team to take that camera and calibrate it by taking many known pictures. Compressed sensing helps there by making sure they don't spend 200 years performing that calibration, reducing the number of known photos needed for calibration in the lab. But while the imaging is indeed very appealing, compressed sensing can also be used to find elements in large crowds, as is the case with the Cylons and group testing. Yaniv Erlich's video presentation of the DNA Sudoku paper mentioned earlier here also has potential. With some thought, we could even think of astounding sensing systems using nature :). All this to say that instead of being negative, compressed sensing can actually help in making scenarios a whole lot more sci-fi and grounded in real science at the same time. I know a few accomplished scientists who were raised on Star Wars, Blade Runner or Minority Report.
Why am I writing this?
I have already crossed paths with clients who had CSI-like expectations, and I would like to contribute to making compressive sensing a reality without undue expectations.
If one of you writes scripts for one of these TV shows, I can direct you to a willing academic on these matters, or consult on these technical insights for a tiny fee.
Thank you to Jordan Ellenberg for getting people interested in the subject and for a great human interest story as well.
A while back, I made a request to Julien Mairal and his co-authors about the need for their potentially very fast online dictionary learning toolbox to be available on win32 systems, as some of us do not do Linux/Unix. Julien came through on his promise to look into it and sent me the following:
The method known as compressed acquisition, better known under the English term `Compressive Sensing', is a new method that makes it possible to capture, and subsequently recover, a signal sampled at sub-Nyquist rates. In order to guarantee perfect reconstruction of the signal, this method requires the construction of a so-called `sensing' matrix possessing particular inversion properties. Here, a construction of this matrix using sequences generated by a chaotic system is proposed, and it is proven that this matrix satisfies the required reconstruction properties with overwhelming probability (higher than that of a Gaussian-type random construction).
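The note does not come with code; here is a minimal sketch of the general idea only (the logistic map, burn-in and normalization choices below are my assumptions, not the paper's construction): iterate a chaotic map, drop a transient, and reshape the sequence into the sensing matrix.

% Fill an m x n sensing matrix with a logistic-map sequence (r = 4 is the
% fully chaotic regime), then center and roughly normalize the columns.
m = 64; n = 256; r = 4; z = 0.3; burn = 1000;
L = burn + m*n; seq = zeros(L, 1);
for t = 1:L
    z = r*z*(1 - z);                 % logistic map iteration
    seq(t) = z;
end
Phi = reshape(seq(burn+1:end), m, n);
Phi = (Phi - mean(Phi(:))) / std(Phi(:)) / sqrt(m);   % zero-mean, unit-ish column norms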
Raghu Kainkaryam reminded me of how long I personally waited to see the poolMC paper, with the clear impact it will have on experimental pooling design and its connection to compressive sensing:
Hi Igor,
Thanks for linking to poolMC. It's been a long time coming. Hope to get some valuable feedback from the CS community.
You know what they say about traditions in Aggieland: "Do something twice and it's a tradition." Mirroring last year's Thank you post, here is this year's. First, in 2009 the readership increased and mostly shifted from direct hits on the website to feed readers.
The tone of the blog has probably changed a little, as I mostly show new preprints without discussing them. As it stands, I try to balance the need to have a place where the latest and greatest is shared rapidly throughout the community and my need to digest all this information by trying to understand what is new and what is not. In order to digest this information, y'all have been amazingly kind to me with my dumb questions. I framed these into Q&As and posted them on the blog; here is the list of the people who kindly obliged in this exercise:
It's Monday; before you gulp that morning cup of joe, you need to read this Fake Steve blog entry first, otherwise you might spill all that coffee on your new LCD screen. Now that you are back, bada bing bada boom, here is the long post of the week. First, I'll be at MIA'09, and the hashtag for it on Twitter will be #MIA09. Gabriel Peyre tells me there will be free wifi. Woohoo.
Daryl Lim (who, incidentally, is applying to graduate school in the US for Fall 2010) let me know the following:
Just letting you know that I found a couple of errors in the CoSaMP implementation on Nuit Blanche (cosamp2.m in the MMA14 archive).
Before being fixed, it only worked for signals which were positive (like the test images), as a modulus function was applied in the least squares estimation step (where it should not have been).
I have fixed it so it works for real valued signals now. Attached is the modified file.
By the way, your blog rocks.
The edited version is called cosamp3.m and is available here. Good call, Daryl; now that you have been featured on Nuit Blanche, I am sure people will look at your graduate application with additional interest. By the way, I agree with you, Daryl: Nuit Blanche rocks!
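For readers who will not download the file, here is a hedged illustration of the step Daryl is describing (a toy example of my own, not the actual cosamp2.m/cosamp3.m contents):

% On a merged support T, CoSaMP's estimation step is a plain least-squares
% solve; wrapping it in abs() discards sign information and only works for
% nonnegative signals.
m = 40; n = 100; A = randn(m, n)/sqrt(m);
x0 = zeros(n,1); x0([3 17 58]) = [1.5; -2.0; 0.7];   % note the negative entry
y = A*x0;
T = [3 17 58];                 % pretend the support estimate is already correct
b = zeros(n,1);
b(T) = A(:,T) \ y;             % correct: recovers x0, including the sign
bad = zeros(n,1);
bad(T) = abs(A(:,T) \ y);      % the kind of bug Daryl found: signs are lost
[norm(b - x0), norm(bad - x0)] % the abs() version misses the negative coefficient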
Tanish Agrawal asked a very interesting question (one that, in my view, is not asked often):
As part of my research work, I am working on compressed sensing for OFDM signals in UWB. I have been following some papers from the Rice University DSP repository.
In particular, I have some doubts regarding the following paper: 'Beyond Nyquist: Efficient Sampling of Sparse Bandlimited Signals' by J. Tropp et al. They discuss implementing a random demodulator in the paper. In the part where they propose a matrix for the accumulate-and-dump sampler (page 5), it works by adding consecutive entries, for which one needs to have all the samples. So, how is it working at a sub-Nyquist rate when we still need to collect all the samples to integrate them?
Could you please take some time to help me understand this?
To which I responded:
The number of eventual measurements, i.e. linear combinations of samples, is much less than what the Nyquist rate would indicate. I agree with you that the signal is being sampled at a high rate.
The issue that needs to be understood is that there is instrumentation for which acquiring those linear combinations of samples is easier than acquiring each and every one of these samples.
In this case, the random demodulator is used to acquire signals that are in the GHz range. While sampling at that high rate (i.e. taking each sample and storing it) is not currently feasible, it is possible to flip the polarity of the sensor at a fast enough rate (GHz) to implement the modulator proposed in the paper. In effect, in this instance, Compressive Sensing enables a new capability that could not exist otherwise.

I think Tanish's question is a central one, in that using the terms subsampling or sub-Nyquist is an impediment to understanding that the new "samples" are actually linear combinations of "normal" samples, and there is nothing magic about it: we cannot have superresolution with compressive sensing if the hardware does not have the capability to "touch" that superresolution in the first place. Because there are few systems where this subsampling is an obvious enabler, we need to think of new technologies that have not been implemented yet. It is also important to have a living document on the current technologies implementing compressive sensing, explaining how different they are from the current state of the art. Thanks Tanish!
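To make the answer concrete, here is a minimal sketch of the measurement model as I read it from the paper (the sizes, tone positions and chipping sequence below are my own toy choices, not Tropp et al.'s setup): the only numbers ever stored are the m accumulate-and-dump outputs, each a linear combination of Nyquist-rate samples.

% Toy random demodulator model: y = H*D*x, where D flips polarity at the
% Nyquist rate and H integrates ("accumulate and dump") over blocks.
N = 512; m = 32;                   % Nyquist-rate length, number of stored measurements
t = (0:N-1)'/N;
p = randperm(N/2); f = p(1:4);     % a few active tones (toy spectrally sparse signal)
x = zeros(N,1);
for k = f, x = x + cos(2*pi*k*t); end
D = diag(sign(randn(N,1)));        % random +/-1 chipping sequence (the fast part)
H = kron(eye(m), ones(1, N/m));    % each row sums N/m consecutive chipped samples
y = H*D*x;                         % the m measurements actually kept (the slow part)

The signal is indeed "touched" at the Nyquist rate by the polarity flips, but only y, of length m much smaller than N, ever needs to be digitized and stored.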
The performance of estimating the common support for jointly sparse signals based on their projections onto lower-dimensional space is analyzed. Support recovery is formulated as a multiple-hypothesis testing problem. Both upper and lower bounds on the probability of error are derived for general measurement matrices, by using the Chernoff bound and Fano's inequality, respectively. The upper bound shows that the performance is determined by a quantity measuring the measurement matrix incoherence, while the lower bound reveals the importance of the total measurement gain. The lower bound is applied to derive the minimal number of samples needed for accurate direction-of-arrival (DOA) estimation for a sparse representation based algorithm. When applied to Gaussian measurement ensembles, these bounds give necessary and sufficient conditions for a vanishing probability of error for the majority of realizations of the measurement matrix. Our results offer surprising insights into sparse signal recovery. For example, as far as support recovery is concerned, the well-known bound in Compressive Sensing with the Gaussian measurement matrix is generally not sufficient unless the noise level is low. Our study provides an alternative performance measure, one that is natural and important in practice, for signal recovery in Compressive Sensing and other application areas exploiting signal sparsity.
In this paper, we describe a novel design of a Peak-to-Average-Power-Ratio (PAPR) reducing system, which exploits the relative temporal sparsity of Orthogonal Frequency Division Multiplexed (OFDM) signals to detect the positions and amplitudes of clipped peaks, by partial observation of their frequency content at the receiver. This approach uses recent advances in reconstruction of sparse signals from rank-deficient projections using convex programming collectively known as compressive sensing. Since previous work in the literature has focused on using the reserved tones as spectral support for optimum peak-reducing signals in the time-domain [5], the complexity at the transmitter was always a problem. In this work, we alternatively use extremely simple peak-reducing signals at the transmitter, then use the reserved tones to detect the peak-reducing signal at the receiver by a convex relaxation of an otherwise combinatorially prohibitive optimization problem. This in effect completely shifts the complexity to the receiver and drastically reduces it from a function of N (the number of subcarriers in the OFDM signal), to a function of m (the number of reserved tones) which is a small subset of N.
We settle the 1-pass space complexity of (1 ± ε)-approximating the Lp norm, for real p with 1 < p < 2, of a length-n vector updated in a length-m stream with updates to its coordinates. We assume the updates are integers in the range [-M, M]. In particular, we show the space required is O(ε^(-2) log(mM) + log log(n)) bits. Our result also holds for 0 < p < 1; although Lp is not a norm in this case, it remains a well-defined function. Our upper bound improves upon previous algorithms of [Indyk, JACM '06] and [Li, SODA '08]. This improvement comes from showing an improved derandomization of the Lp sketch of Indyk by using k-wise independence for small k, as opposed to using the heavy hammer of a generic pseudorandom generator against space-bounded computation such as Nisan's PRG. Our lower bound improves upon previous work of [Alon-Matias-Szegedy, JCSS '99] and [Woodruff, SODA '04], and is based on showing a direct sum property for the 1-way communication of the gap-Hamming problem.
We discuss how recently discovered techniques and tools from compressed sensing can be used in tensor decompositions, with a view towards modeling signals from multiple arrays of multiple sensors. We show that with appropriate bounds on coherence, one could always guarantee the existence and uniqueness of a best rank-r approximation of a tensor. In particular, we obtain a computationally feasible variant of Kruskal’s uniqueness condition with coherence as a proxy for k-rank. We treat sparsest recovery and lowest-rank recovery problems in a uniform fashion by considering Schatten and nuclear norms of tensors of arbitrary order and dictionaries that comprise a continuum of uncountably many atoms.
In this work, we propose algorithms to recursively and causally reconstruct a sequence of natural images from a reduced number of linear projection measurements taken in a domain that is “incoherent” with respect to the image’s sparsity basis (typically wavelet) and demonstrate their application in real-time MR image reconstruction. For a static version of the above problem, Compressed Sensing (CS) provides a provably exact and computationally efficient solution. But most existing solutions for the actual problem are either offline and non-causal or cannot compute an exact reconstruction (for truly sparse signal sequences), except using as many measurements as those needed for CS. The key idea of our proposed solution (modified-CS) is to design a modification of CS when a part of the support set is known (available from reconstructing the previous image). We demonstrate the exact reconstruction property of modified-CS on full-size image sequences using much fewer measurements than those required for CS. Greatly improved performance over existing work is demonstrated for approximately sparse signals or noisy measurements.
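A minimal sketch of the modified-CS idea described above, written as a linear program (this is not the authors' code; the toy sizes are mine and the Optimization Toolbox's linprog is assumed): minimize the l1 norm only outside the known part T of the support, subject to the measurement constraints.

% min ||x_{T^c}||_1  s.t.  A*x = y, recast with x = u - v, u,v >= 0.
n = 200; m = 80; s = 20;
A = randn(m, n)/sqrt(m);
x0 = zeros(n,1); p = randperm(n); supp = p(1:s); x0(supp) = randn(s,1);
y = A*x0;
T = supp(1:15);                      % support known from the previous frame (toy choice)
w = ones(n,1); w(T) = 0;             % zero cost on the known support
f = [w; w];                          % cost on u and on v
uv = linprog(f, [], [], [A, -A], y, zeros(2*n,1), []);
x = uv(1:n) - uv(n+1:end);
norm(x - x0)/norm(x0)                % relative recovery error (should be small)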
In this paper, we propose and study the use of alternating direction algorithms for several L_1-norm minimization problems arising from sparse solution recovery in compressive sensing, including the basis pursuit problem, the basis-pursuit denoising problems of both unconstrained and constrained forms, as well as others. We present and investigate two classes of algorithms derived from either the primal or the dual forms of the L_1-problems. The construction of the algorithms consists of two main steps: (1) to reformulate an L_1-problem into one having partially separable objective functions by adding new variables and constraints; and (2) to apply an exact or inexact alternating direction method to the resulting problem. The derived alternating direction algorithms can be regarded as first-order primal-dual algorithms because both primal and dual variables are updated at each and every iteration. Convergence properties of these algorithms are established or restated when they already exist. Extensive numerical results in comparison with several state-of-the-art algorithms are given to demonstrate that the proposed algorithms are efficient, stable and robust. Moreover, we present numerical results to emphasize two practically important but perhaps overlooked points. One point is that algorithm speed should always be evaluated relative to appropriate solution accuracy; another is that whenever erroneous measurements possibly exist, the L_1-norm fidelity should be the fidelity of choice in compressive sensing.
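Along the same lines, here is a generic alternating-direction (ADMM) sketch for the unconstrained basis-pursuit-denoising form min 0.5||Ax - y||^2 + lambda||x||_1; it is a textbook variant for illustration only, not the YALL1 implementation, and the function name and parameters (rho, iters) are my own.

function x = admm_bpdn(A, y, lambda, rho, iters)
% Generic ADMM for min 0.5*||A*x - y||^2 + lambda*||x||_1 (illustration only).
    n = size(A, 2);
    x = zeros(n,1); z = zeros(n,1); u = zeros(n,1);
    Aty = A'*y;
    R = chol(A'*A + rho*eye(n));            % cache the Cholesky factor
    soft = @(v,t) sign(v).*max(abs(v) - t, 0);
    for k = 1:iters
        x = R \ (R' \ (Aty + rho*(z - u))); % x-update: ridge-type solve
        z = soft(x + u, lambda/rho);        % z-update: soft thresholding
        u = u + x - z;                      % scaled dual update
    end
end

Called as, say, x = admm_bpdn(A, y, 1e-2, 1, 200), both primal (x, z) and dual (u) variables are updated at every iteration, which is the first-order primal-dual flavor the abstract alludes to.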
YALL1 is available in Matlab. As most of you may already know, MATLAB R2009b has a potentially serious bug. You can find a description here, but since this is a serious problem relating to the use of the transpose sign, which is heavily used in compressive sensing solvers, let me copy the message here:
There seems to be a serious bug in the current Matlab release R2009b that occurs when solving a linear system of equations with the transpose. Here is a small example:
A=[ 0 2 ; 1 0 ]; b=[ 2 ; 0 ]; A'\b
ans =
     0
     1
whereas the correct solution is [ 0 ; 2 ]. The answer [ 0 ; 1 ] is the solution of A\b, so it appears that the transpose sign ' is ignored. However, the correct solution is returned when using
transpose(A)\b, A.'\b
instead of A'\b.
Interestingly, this bug does not occur for all matrices, for instance
A = [1 2 ; 1 1]; A'\b
returns the correct result [ -2 ; 4 ].
The bug seems to be new in release R2009b and applies to Linux and Windows 32 and 64 bit versions. A bug report has been filed.
Look under MATLAB, in the Mathematics Category, Exists In R2009b.
MATLAB customers should feel free to Watch this Bug for any updates, or Subscribe to an RSS feed of a group of Bugs of interest.
This Bug Report details the circumstances required to trigger the bug.
The summary is that A must be the permutation of a full, square, non-scalar triangular matrix (but not exactly triangular) and must be used as A'\b in the solve.
The Bug Report does not yet reflect this, but the bug will be fixed in our next release of MATLAB.
The ease of image storage and transmission in modern applications would be unfeasible without compression, which converts high-resolution images into a relatively small set of significant transform coefficients. Due to their specific content, many real-world images are highly sparse in an appropriate orthonormal basis. The inherent property of compressed sensing (CS) theory of working simultaneously as a sensing and compression protocol, using a small subset of random incoherent projection coefficients, enables a potentially significant reduction in the sampling and computation costs of images, favoring its use in real-time applications that do not require excellent reconstruction performance. In this paper, we develop a Bayesian CS (BCS) approach for obtaining highly sparse representations of images based on a set of noisy CS measurements, where the prior belief that the vector of projection coefficients should be sparse is enforced by fitting directly its prior probability distribution by means of a Gaussian Scale Mixture (GSM). The experimental results show that our proposed method, when compared with norm-based constrained optimization algorithms, maintains the reconstruction performance, in terms of the reconstruction error and the PSNR, while achieving an increased sparsity using far fewer basis functions.
In the Sparsity in Everything series, a series of posts that highlights sparsity and power laws in odd places for the purpose of shamelessly linking them back to compressive sensing, I mentioned the work of Michael Schmidt and Hod Lipson entitled Distilling Free-Form Natural Laws from Experimental Data, which seemed to amount to solving some sort of l_0 problem (i.e. fitting the data/phenomena at hand with a combinatorial search using genetic algorithms). Well, they just came out with a GUI that allows anyone to try that experiment for themselves. It's called Eureqa. In Equation search part 1, Andrew Gelman just wrote in a string of data he sent and he's waiting for an answer. He might be waiting for a long time... In the meantime, I am surprised to see that there is no response to the criticism made of the fitness function of the original article and now this program. The criticism was brought forth here:
A recent paper [1] introduced the fascinating idea that natural symbolic laws (such as Lagrangians and Hamiltonians) could be learned from experimental measurements in a physical system (such as a pendulum). The learning is based on a genetic algorithm that employs a fitness function involving the pair-wise partial derivatives of the time-series measurements of state-space data coming from the system. While the general approach is original and convincing, we feel that the particular algorithm described in the paper cannot accomplish these goals. We present theoretical and experimental arguments for this belief and also explain some mathematically rigorous ideas that can be used within the authors’ framework to find symbolic laws. We first recapitulate the main steps in the methodology
And while you are waiting for Eureqa to converge (to the wrong solution?), you might as well do two things:
1. Enjoy reading papers from NIPS 2009. They're out. Andrew McGregor points to some streaming papers. While reading some of them offline, I wondered aloud whether, in some exploratory instances, we should not go about the following process to do compressive sensing: