Yves Wiaux sent me the following about a meeting on September 4-9, 2011.
The international BASP Frontiers workshop was created to promote synergies between the Astronomical and Biomedical communities, around selected common challenges for signal processing. The 2011 workshop is organized by the EPFL Signal Processing Laboratories and will take place in September [04-09] in a very nice resort in the Swiss Alps [Villars] close to Lausanne and Lake Geneva. It will focus on Fourier Sampling for Magnetic Resonance and Radio Interferometric Imaging.
Do not hesitate to have a look at the website [BASP Frontiers website: http://imxgam.in2p3.fr/BASP2011/], where you will find all the required information regarding the scientific context, committees, venue, keynote and invited participants [including R. Baraniuk, E. Candes, V. Goyal, T. Strohmer, and P. Vandergheynst], program, and registration [opening April 1]. If you would like to participate, please contact us [BASP email contact: baspfrontiers@epfl.ch] and provide the information detailed on the "invitation request page". Do not hesitate to disseminate this announcement either.
I've been following your blog for a while and it has been very useful for keeping me on top of ongoing research in sparsity. I believe that the following workshop may be of interest to readers of your blog -- the Structured Sparsity: Learning and Inference Workshop [1]. It is organized by Francis Bach, Han Liu, Guillaume Obozinski, Eric Xing and myself. The workshop is part of the 28th International Conference on Machine Learning [2] (June 28 through July 2, 2011). Our invited speakers mainly come from the machine learning and statistics communities, but we also hope to attract some signal processing folks thanks to the commitment of Prof. Volkan Cevher. The call for contributions can be found here [3].
We would greatly appreciate it if you could publicize the workshop.
In other news, the Duke University Workshop on Sensing and Analysis of High-Dimensional Data (SAHD) is planned for July 26-28, 2011. The meeting will be held at the Washington Duke Inn and Golf Club, adjacent to the Duke campus. The meeting is being organized and hosted by the following Duke faculty: David Brady, Robert Calderbank, Lawrence Carin, Ingrid Daubechies, David Dunson, Mauro Maggioni and Rebecca Willett.
Since there have been a few of these, I am listing some of the meetings related to CS on this CS meetings page.
I heard through the grapevine that some people tended to avoid reading Nuit Blanche because it felt like going to a conference every single time they read a blog entry. Others even reported fearing that their ideas would already be published by other groups, or that somebody else's thinking might taint their own thought process. This fear and loathing is absolutely misplaced, because it's just the beginning. There is plenty of work, and there are plenty of revolutionary ideas, ahead of us. Recall what Richard Hamming said: "It's not the consequence that makes a problem important, it is that you have a reasonable attack" ("You and Your Research" --- Richard Hamming (1986)). Think of the blog as providing a convenient way of finding many other reasonable attacks. In that context, I also like Howard Aiken's quote: "Don't worry about people stealing your ideas. If it's original, you'll have to ram it down their throats". We have had a taste of that already.
Let us look at some numbers to get a sense of the reality of the "competition" and of the interest in the subject. About six months ago, I asked one of you to provide me with the stats of her site after it was featured here. Here is what the Nuit Blanche Effect looked like then:
It may sound like a small number, except that those are high quality visits; and what are the chances that those visitors are working in your exact field, on your exact problem? Next, here are the stats of the blog and of the compressive sensing big picture page after being featured on Wired:
Finally, here are the stats of this cartoon. Let us mention that the latest increase comes from being featured on the CS big picture site (i.e., the constant traffic there of about 100 to 200 views per day), which itself is featured on the Wikipedia entry:
Here are the lessons from scrutinizing these data:
First and foremost, if you have a really revolutionary idea, chances are nobody but yourself is thinking about it or has the drive to implement it.
Second, if you want a subject to be of interest to students and engineers all over the world, make sure your entry on Wikipedia sticks (by the way, I think the current one is far from optimal). I don't mind that Wikipedia enforces a no-follow tag on every one of its articles; the traffic from Wikipedia sustains itself over time. If you are working on a subject that is new (or not), I suggest you write an entry on Wikipedia and then link to the CS page. For the senior folks reading this, this means getting your new grad students to write that entry. It will help them, and it will help you.
Third, I have serendipitously heard that many of you have talked to each other, gotten papers out, or traveled halfway around the world thanks to this blog. If there is something I am really proud of, this is it.
Fourth, I am not sure how to transfer this type of dynamic to the Compressive Sensing LinkedIn group, which now has 378 members from industry and academia. I just added the Nuit Blanche RSS feed to the group news.
Finally, for those of you really fearing and loathing reading this blog because it might be the bearer of bad news, in that you might be upstaged by somebody else, I have a solution for you: I am going to build an iPhone app featuring only Nuit Blanche as a feed and sell it for $500 on the App Store. That way, you can buy it and feel content that the information is on your iPhone, but never open it, so that you can keep your hopes up that you have not been upstaged: a situation not unlike that of high-priced gym memberships.
P.S.: You are bored and you want to see some action? What about changing the entry on linear systems of equations in Wikipedia to reflect the new findings of compressive sensing and waiting for the "Everyone knows this is impossible" slam-down quote?
Because it works OK for the task at hand. We all know that coded aperture done with a compressive sensing twist would lead to superresolution and 3D capabilities, but:
the reconstruction would be much slower
superresolution may not be needed
there is a fear that a nonlinear reconstruction process would not fail gracefully.
Maybe this is just true; maybe what we are looking for is simple enough that having an OK measurement is better than the best possible reconstruction from fewer measurements. Then again, if one wants to use the data for other purposes like labeling and so forth, compressive sensing is the only framework that provides some guarantee that you have acquired all the information, not just some part of it. Is that zeroth or first order? I am thinking of a round number; and you?
If you watch the video all the way through, Ramesh is looking forward to being able to label everything as well; can one do that without enough information?
Responding to a request I made yesterday, one of the readers of this blog kindly sent me an invitation for Google Wave. However, it looks like Google has a waiting list even for that, so I have not received anything. Google, when a party sends an invitation, what really matters is not the sending; it is the receiving of that invitation by the second party that makes it an invitation. The process reminds me of the rental reservation process as experienced by Seinfeld.
Inline digital holograms are classically reconstructed using linear operators to model diffraction. It has long been recognized that such reconstruction operators do not invert the hologram formation operator. Classical linear reconstructions yield images with artifacts such as distortions near the field-of-view boundaries or twin-images. When objects located at different depths are reconstructed from a hologram, in-focus and out-of-focus images of all objects superimpose upon each other. Additional processing, such as maximum-of-focus detection, is thus unavoidable for any successful use of the reconstructed volume. In this letter, we consider inverting the hologram formation model in a Bayesian framework. We suggest the use of a sparsity-promoting prior, verified in many inline holography applications, and present a simple iterative algorithm for 3D object reconstruction under sparsity and positivity constraints. Preliminary results with both simulated and experimental holograms are highly promising.
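This is not the authors' algorithm, but for readers who want a feel for what "a simple iterative algorithm under sparsity and positivity constraints" can look like, here is a minimal projected iterative soft-thresholding sketch, assuming a generic linear hologram-formation operator A (the sizes and the lam parameter below are made up):

import numpy as np

def ista_positive(A, y, lam=0.1, n_iter=200):
    """Sketch: minimize 0.5*||A x - y||^2 + lam*||x||_1 subject to x >= 0
    by iterative soft-thresholding with a positivity projection."""
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)            # gradient of the data-fit term
        z = x - grad / L                    # gradient step
        x = np.maximum(z - lam / L, 0.0)    # soft-threshold, then keep the positive part
    return x

# toy example: a sparse nonnegative "object" and a random stand-in for the hologram operator
rng = np.random.default_rng(0)
A = rng.standard_normal((128, 512)) / np.sqrt(128)
x_true = np.zeros(512)
x_true[rng.choice(512, 10, replace=False)] = rng.random(10) + 0.5
x_hat = ista_positive(A, A @ x_true, lam=0.02)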
Many theories of neural networks assume rules of connection between pairs of neurons that are based on their cell types or functional properties. It is finally becoming feasible to test such pairwise models of connectivity, due to emerging advances in neuroanatomical techniques. One method will be to measure the functional properties of connected pairs of neurons, sparsely sampling pairs from many specimens. Another method will be to find a ‘‘connectome,’’ a dense map of all connections in a single specimen, and infer functional properties of neurons through computational analysis. For the latter method, the most exciting prospect would be to decode the memories that are hypothesized to be stored in connectomes.
In random order, here is a list of items that have been started or are on my mind:
I want to start a series of posts entitled "These Technologies Do Not Exist", which will feature technologies that do not exist yet and that could be enabled by compressive sensing if they were investigated in depth. The idea is that most funding agencies mostly fund incremental technologies, and that sometimes there is a need for researchers to go beyond being part of the 10% improvement business. If you have one of those ideas and feel it is OK to give it to the world, then by all means contact me for inclusion in the series.
Twitter recently created a list feature. I put together a list of people who have declared an interest in compressive sensing, have read the blog, or have even published in the field; it is at:
Please realize this is not a list of people talking about compressive sensing all the time, but rather a community of like-minded spirits.
I want to expand the Big Picture page a little bit, as there are clearly some areas that are more vigorous than others. I am thinking about including a blurb about:
If anybody wants to help, you'll be duly credited on the page for your effort, as usual.
I am considering starting a project dedicated to producing a compressive sensing EEG system for the purpose of doing very cheap BCI (Brain-Computer Interface). For that, I think I need a Google Wave account and the ability to invite people (i.e., to have Google Wave invites as well), since otherwise the concept is pretty moot. For the Googlers reading this blog who have the means to make this happen, it's a hint.
Finally, because of different circumstances, I am thinking aloud about starting a company dealing with incoherent systems, which, as you might rightly guess, has something to do with the subject area covered here.
I don't know if I would qualify as a specialist, but I sense that the throughput in compressive sensing work is decreasing somewhat. Let me explain: in the past two years, we have gotten better ways of describing the properties of encoding matrices (null space property, RIP, ...), some new concepts of encoding matrices seem to be a better fit to the physics of interest to some engineering fields, we have more domain-specific dictionaries, and the reconstruction solvers are advertised to be increasingly better. In the past few months, the number of papers directly related to these issues has decreased. At the same time, we have also seen extensions of the field in different directions: there is some maturing of hardware concepts, and new ones are appearing based on the properties mentioned above. We also see concepts and techniques developed in CS being applied to other domains, such as matrix completion, as well as generalizations of CS in different directions (nonlinear, ...). The recent call by Muthu to define CS more precisely is a sign that the subject is indeed morphing into something wider. There is no reason to expect the contrary, as all engineering fields are drowning under underdetermined linear systems. In fact, I don't see a trend away from looking for a few variables of interest in large and complex models.
Looking to the future
At a different level, if one removes the issue of reconstruction (which could be construed as a de facto proof that you guessed the right dimensionality of the system), the random projection technique gives way to a whole new world of manifold signal processing and to the rise of domain-specific sensors. Is the world, besides the military, ready for this type of market? Only time will tell; I am personally looking forward to seeing how the issue of calibration will be investigated. Eventually, this and other areas of interest will be born out of compressive sensing, but they will hardly be recognizable as such, and This Is A Good Thing (TM).
The direction of this blog
Writing this blog takes a personal toll in the form of having to read Other People's Ideas (OPI), but there is hope that it removes situations where false dichotomies spread too far in the system and eventually yield unimportant but self-sustaining concepts. The other toll is that the average reader may feel overwhelmed. The poll I ran last week (see graph above) is hardly statistically significant, but it suggests that the blog is probably giving too much information about the new branches of CS without providing much context. From talking to some of you, there is also an element of weariness in traditional CS about the claims made for the solvers and other encoding approaches. For instance, with regard to solvers, there is a sense that much parameter guessing goes into reproducing the results of the underlying publication. While one way to avoid this situation is to make one's code available for people to use, another is probably to make sure that a common benchmark exercise, a la Lena, is available to authors. Yet this is rarely done, even though Sparco is there for that purpose. Closer to home, another way to answer the sense of lost direction expressed by the poll is to provide more context by going back to those interview pieces where I ask dumb questions of real experts in the field. While I recognize the limits of the exercise, any other insights on how to do this effectively are welcome.
And now to our regularly scheduled broadcast: we have two papers and a CfP for a publication:
We analyze the asymptotic performance of sparse signal recovery from noisy measurements. In particular, we generalize some of the existing results for the Gaussian case to subgaussian and other ensembles. An achievable result is presented for the linear sparsity regime. A converse on the number of required measurements in the sub-linear regime is also presented, which covers many of the widely used measurement ensembles. Our converse idea makes use of a correspondence between compressed sensing and compound channels in information theory.
From the Rice Repository we have an older paper I had not covered before:
In this paper, we present a Wyner-Ziv coding scheme based on random projections for image compression with side information at the decoder. The proposed coder consists of random projections (RPs), nested scalar quantization (NSQ), and Slepian-Wolf coding (SWC). Most natural images are compressible or sparse in the sense that they are well-approximated by a linear combination of a few coefficients taken from a known basis, e.g., an FFT or wavelet basis. Recent results show that it is, surprisingly, possible to reconstruct a compressible signal to within very high accuracy from a limited number of random projections by solving a simple convex optimization program. Nested quantization provides a practical scheme for lossy source coding with side information at the decoder to achieve further compression. SWC is lossless source coding with side information at the decoder. In this paper, ideal SWC is assumed; thus rates are conditional entropies of NSQ quantization indices. Recent theoretical analysis shows that for the quadratic Gaussian case and at high rate, NSQ with ideal SWC performs the same as conventional entropy-coded quantization with side information available at both the encoder and decoder. We note that the random-projection measurements of a large natural image behave like Gaussian random variables, because most random measurement matrices behave like Gaussian ones if their sizes are large. Hence, by combining random projections with NSQ and SWC, the tradeoff between compression rate and distortion is improved. Simulation results support the proposed joint codec design and demonstrate the considerable performance of the proposed compression systems.
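This is not the paper's codec, but the front end it describes (random projections followed by nested scalar quantization, where only the coset index of each fine quantization cell is kept and the decoder resolves the ambiguity with side information) can be sketched in a few lines; the step size q and nesting factor M below are made-up numbers:

import numpy as np

rng = np.random.default_rng(1)
n, m = 4096, 512                                   # vectorized image size and number of projections
x = rng.standard_normal(n)                         # stand-in for a (vectorized) image
Phi = rng.standard_normal((m, n)) / np.sqrt(m)     # random projection matrix

y = Phi @ x                                        # random projections, approximately Gaussian

q, M = 0.05, 16                                    # fine step size and nesting factor (coarse step = q*M)
fine_idx = np.round(y / q).astype(int)             # fine scalar quantization
coset_idx = np.mod(fine_idx, M)                    # only the coset index would be transmitted (after SWC);
                                                   # side information at the decoder picks the right coset member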
Distributed camera networks have applications in diverse areas, including urban surveillance, environmental monitoring, healthcare, and battlefield visualization. Although distributed camera networks have been increasing in number, effective methods for dissemination, processing and understanding of video data collected by distributed camera networks are lacking. There is a strong need for developing distributed algorithms for compression and dissemination of video data, detection and tracking of moving objects, motion capture, and higher level tasks such as visualization, recognition of objects and the activities they are involved in. Most of the existing algorithms for these tasks operate in a centralized manner---imagery or other information such as location and identity are transmitted to a server that performs the necessary processing. Such server-oriented architectures do not scale for the problems mentioned above. Another feature of distributed camera networks is that the use of distributed algorithms adds network bandwidth and power to the mix of constraints; those constraints are particularly tight for wireless networks. Algorithms may need to be redesigned to meet these requirements---simple mapping onto embedded platforms is often not sufficient. Manuscripts are solicited to address a wide range of topics in distributed camera networks, including but not limited to the following:
List of Specific Topics Covered:
Sensing
  Collaborative (in-network) processing
  Compressive sensing and sparse representation
  Adaptive sensing (e.g., using different spatial/time resolutions, bit depths, etc.)
Processing
  Distributed camera calibration
  Distributed video compression
  Efficient video transmission
  Detection and tracking of objects
  Recognition of identity and events
  Visualization
Communication
  Network architectures
  Efficient protocols
  Secure transmission
  Camera scheduling
  Cross-layer protocols
Computing
  Embedded systems
  Low-power computing
  Software protocols
  Privacy protection
Timeline for Submission, Review, and Publication:
Manuscript submission: August 30, 2009
Preliminary results: November 15, 2009
Revised version: January 15, 2010
Notification: February 15, 2010
Final manuscripts due: March 15, 2010
Anticipated publication: June 1, 2010
List of Guest Editors: Prof. Rama Chellappa, University of Maryland; Prof. Wendi Heinzelman, University of Rochester; Prof. Janusz Konrad, Boston University; Prof. Wayne Wolf, Georgia Institute of Technology
Will someone ... explain to the world what is NOT compressed sensing. It is NOT
Any sparse representation with small number of nonzeros.
Any dimensionality reduction by random projections.
Any operation that replaces L_0 optimization by L_1.
Any heavy hitters algorithm.
I do not want to listen to so many machine learning, signal processing, data mining, statistics, and image processing researchers tell me that they do compressed sensing. Or maybe I should throw up my hands and let the world embrace Compressed Sensing; after all, they are sensing something, and it is compressed.
One of the anonymous commenters wrote:
You should explain why it matters that things are mislabeled. Do Compressed Sensing techniques apply to the things you consider *not* to be Compressed Sensing? If so, what is the harm in mislabeling? Is the harm that techniques are incorrectly applied?
I don't think this is a question of applying techniques incorrectly. I was about to write a longer post, but then went ahead and discussed the subject with one of you, on how s/he and I interpreted what was and was not CS. After a fruitful discussion, we agreed to disagree: there is no easy definition, except that CS has provided a mathematically strong framework for totally new approaches to sampling and to other fields. It doesn't really mean that anything goes, but rather that everybody has a different appreciation of where this is leading and of what the most important problem is.
In a different field, I am sure that Haar wavelets were not considered real wavelets by the leaders of that field, but wavelets have nonetheless allowed specialists from several fields to speak the same language and to normalize their approaches.
The effect of a field becoming popular has happened before. By serendipity, I found the following 1956 paper by Claude Shannon himself bemoaning the use of the term "Information Theory".
A study of the use of the term "Information Theory" on the Google timeline since 1956 seems to show that the best policy is probably to "throw up one's hands and let the world embrace the subject." :-)
In recent work, we studied the problem of causally reconstructing time sequences of spatially sparse signals, with unknown and slow time-varying sparsity patterns, from a limited number of linear “incoherent” measurements. We proposed a solution called Kalman Filtered Compressed Sensing (KF-CS). The key idea is to run a reduced order KF only for the current signal’s estimated nonzero coefficients’ set, while performing CS on the Kalman filtering error to estimate new additions, if any, to the set. KF may be replaced by Least Squares (LS) estimation and we call the resulting algorithm LS-CS. In this work, (a) we bound the error in performing CS on the LS error and (b) we obtain the conditions under which the KF-CS (or LS-CS) estimate converges to that of a genie-aided KF (or LS), i.e. the KF (or LS) which knows the true nonzero sets.
In recent work, Kalman Filtered Compressed Sensing (KF-CS) was proposed to causally reconstruct time sequences of sparse signals, from a limited number of “incoherent” measurements. In this work, we develop the KF-CS idea for causal reconstruction of medical image sequences from MR data. This is the first real application of KF-CS and is considerably more difficult than simulation data for a number of reasons, for example, the measurement matrix for MR is not as “incoherent” and the images are only compressible (not sparse). Greatly improved reconstruction results (as compared to CS and its recent modifications) on reconstructing cardiac and brain image sequences from dynamic MR data are shown.
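Not the authors' code, but the gist of one LS-CS time step can be sketched as follows: do a least-squares estimate on the currently known support, then run a crude CS step on the residual (here just a thresholded correlation, standing in for a real CS solver) to detect additions to the support; the threshold and sizes are arbitrary, and the deletion step described in the papers is omitted:

import numpy as np

def ls_cs_step(A, y, support, add_thresh=0.5):
    """One simplified LS-CS time step: reduced-order LS on the known support,
    then detect new support entries from the LS error (additions only)."""
    T = sorted(support)
    x_hat = np.zeros(A.shape[1])
    if T:
        x_hat[T] = np.linalg.lstsq(A[:, T], y, rcond=None)[0]   # LS on the current support
    residual = y - A @ x_hat                                    # LS (or filtering) error
    corr = np.abs(A.T @ residual)                               # crude CS step on the error
    new_entries = set(np.flatnonzero(corr > add_thresh).tolist()) - set(T)
    return x_hat, set(support) | new_entries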
In this paper we study the compressed sensing problem of recovering a sparse signal from a system of underdetermined linear equations when we have prior information about the probability of each entry of the unknown signal being nonzero. In particular, we focus on a model where the entries of the unknown vector fall into two sets, each with a different probability of being nonzero. We propose a weighted $\ell_1$ minimization recovery algorithm and analyze its performance using a Grassman angle approach. We compute explicitly the relationship between the system parameters (the weights, the number of measurements, the size of the two sets, the probabilities of being non-zero) so that an iid random Gaussian measurement matrix along with weighted $\ell_1$ minimization recovers almost all such sparse signals with overwhelming probability as the problem dimension increases. This allows us to compute the optimal weights. We also provide simulations to demonstrate the advantages of the method over conventional $\ell_1$ optimization.
I note the following interesting bit that gives some context on how this new approach is different:
The analysis uses the high-dimensional geometry techniques first introduced by Donoho and Tanner [1], [3] (e.g., Grassman angles) to obtain sharp thresholds for compressed sensing. However, rather than use the neighborliness condition used in [1], [3], we find it more convenient to use the null space characterization of Xu and Hassibi [4], [13]. The resulting Grassmanian manifold approach is a general framework for incorporating additional factors into compressed sensing: in [4] it was used to incorporate measurement noise; here it is used to incorporate prior information and weighted ℓ1 optimization. Our analytic results allow us to compute the optimal weights for any p1, p2, n1, n2. We also provide simulation results to show the advantages of the weighted method over standard ℓ1 minimization.
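For readers who want to play with the idea, here is a minimal sketch of weighted ell_1 minimization, min sum_i w_i |x_i| subject to Ax = y, written as a linear program with scipy; the two-block weighting below (a smaller weight on the set believed more likely to be nonzero) is purely illustrative and has nothing to do with the authors' optimal weights:

import numpy as np
from scipy.optimize import linprog

def weighted_l1(A, y, w):
    """Solve min_x sum_i w_i |x_i| s.t. A x = y via the split x = xp - xm with xp, xm >= 0."""
    m, n = A.shape
    c = np.concatenate([w, w])                    # objective on [xp; xm]
    A_eq = np.hstack([A, -A])                     # A (xp - xm) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    return res.x[:n] - res.x[n:]

rng = np.random.default_rng(2)
n, m = 200, 80
A = rng.standard_normal((m, n)) / np.sqrt(m)
x = np.zeros(n)
x[rng.choice(50, 8, replace=False)] = rng.standard_normal(8)   # nonzeros live mostly in the first block
w = np.r_[np.full(50, 0.5), np.full(n - 50, 2.0)]              # smaller weight where nonzeros are more probable
x_hat = weighted_l1(A, A @ x, w)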
On a different note, Dirk Lorenz, a reader of this blog, kindly emailed me the following:
...when I checked your entry on "A Coordinate Gradient Descent Method for l_1-regularized Convex Minimization" on Nuit Blanche, I started to check the paper in detail. Well, I do not agree with your remark that it seems that CGD is faster than any other method. At least from their tables, they always report a quite large objective value for CGD. I guess that other methods may be much faster when solving the problem to just that accuracy.
After some back and forth e-mail discussion, he and I agreed that the examples shown in the paper point to an anecdotal fact. The CGD solver seems to be going much faster than some of the fastest reconstruction codes, like GPSR_BB and FPC, for the examples featuring a compressed sensing setup (partial Fourier and Gaussian matrices, Tables 1-4). In some cases, it is 10 times faster. At the same time, the same solver seems to be doing worse than the others on the blurring example. As I said, it is anecdotal, but for somebody who sees things through the prism of CS, it sure is intriguing. I realize that mathematicians have other, stronger standards. Thanks Dirk for pointing this out!
On a related note, it's a hunch, but I would not be overly surprised if, when we do have CS imagers, they have an advantage over wavefront coding imaging systems, since the mixing in compressive sensing imagers would be more like the Gaussian mixing mentioned above than the specific blurring kernel instantiated by the cubic-PM phase mask used in wavefront coding. But then again, I am not a specialist like Ramesh.
Compressed Sensing and the related, recently introduced Smashed Filter are novel signal processing methods, which allow for low-complexity parameter estimation by projecting the signal under analysis onto a random subspace. In this paper the Smashed Filter of Davenport et al. is applied to a principal problem of digital communications: pilot-based time offset and frequency offset estimation. An application, motivated by current Cognitive Radio research, is wide-band detection of a narrowband signal, e.g. to synchronize terminals without prior channel or frequency allocation. Smashed Filter estimation and maximum likelihood-based, uncompressed estimation for a signal corrupted by additive white Gaussian noise (Matched Filter estimation) are compared. Smashed Filtering adds a degree of freedom to signal detection and estimation problems, which effectively allows one to trade signal-to-noise ratio against processing bandwidth for arbitrary signals.
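As a reading aid (and not the authors' implementation), the smashed-filter idea for time-offset estimation boils down to correlating the compressed measurements with compressed, shifted copies of the known pilot; a minimal sketch with made-up sizes:

import numpy as np

rng = np.random.default_rng(3)
N, M, P = 1024, 128, 64                      # window length, number of projections, pilot length
pilot = rng.choice([-1.0, 1.0], P)           # known pilot sequence
true_offset = 300

s = np.zeros(N)
s[true_offset:true_offset + P] = pilot       # pilot embedded at an unknown offset
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ s + 0.05 * rng.standard_normal(M)  # noisy compressive measurements

scores = []
for tau in range(N - P + 1):
    t = np.zeros(N)
    t[tau:tau + P] = pilot                   # candidate shifted template
    scores.append((Phi @ t) @ y)             # matched filtering in the compressed domain
tau_hat = int(np.argmax(scores))             # estimated time offset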
Other reports from the Karlsruhe group can be found here. Cognitive Radios have been added to the Compressive Sensing Hardware page, even though I do not know of a specific hardware implementation. Let us hope it finds its way as a module of some sort into the GNU Radio project. If you are interested in Cognitive Radios, Compressive Sensing and Sensor Networks, there is an announcement for a Postdoc position in Computer Science & Engineering, Electrical Engineering, Mathematics & Statistics at the University of Vigo (Doctoral University) in Spain. More can be found on the Compressive Sensing Jobs pages.
The Bayesian Compressive Sensing page has been updated with another, newer version of TSW-CS. It can be downloaded at: tswcs.zip (Last Updated: Jan. 19, 2009). It would not hurt if there were a version number associated with each of these releases. Again, all these solvers are listed in the reconstruction section of the Big Picture.
As some of you know, I am interested in the real impact of this blog. One way to gauge it is through an evaluation of "clean" statistics, i.e., statistics showing how this site (and no other site) brings folks to read a specific paper or download an application. About two weeks ago, Mark Iwen made available the code for his Ann Arbor Fast Fourier Transform, i.e., "a Fast Fourier Transform (FFT) method for frequency sparse functions. Recovers the k most energetic frequencies hidden in a bandwidth-N function in k*polylog(N) time."
This is an important algorithm. The result of this "clean trial" yielded the following stats (see above) in a week when most people had not yet gotten back to school (it was a low-stats period for the blog too): about 50 hits the first day and a sustained 20 hits per day for five days. As usual, the tail cumulatively counts roughly as much as the first day. I think those numbers would be higher in a higher-traffic period.
The number of people reading this blog through an RSS feed is in the realm of 240. This is a little less than the number of people coming directly to the site every day (255/day), and while these two sets overlap, their intersection is not large compared to either set, since most of the folks coming to the site do so after a Google search or through Wikipedia.
Another stat that caught my eye is the large increase in people coming to check the videos / online talks: a five-fold increase in eight months and a near two-fold increase in December, thanks to the link put on the Rice repository. I think it says a lot about the future of information sharing, even in the intellectual realm. Next time you organize a workshop or a conference, you may want to think about setting aside some money for video production!
Right now, the latest videos are at the top, but I am not sure that is a good way of organizing them, as some of the earlier talks at IMA are still very important for newcomers to watch first. I should probably put more of them inside the big picture and the other static pages. Other statistics are provided by YouTube for every video. I have looked at a small sample of them, and it looks like when a video is featured on the site, one can expect an average of 160 people to watch it. To the students who went through the pain of producing a video on the subject: please, pretty please, do make sure your information on YouTube links back to your website.
To get a sense of the traffic: the traffic on the video site is currently one fifth that of the big picture page and one tenth that of the blog, but it is also picking up speed.
Following up on the previous entry, I asked Dror Baron some questions on how the matrices used in that paper compared to other sparse measurement matrices:
In your paper, you mention that "similar matrices were also proposed for CS by Berinde and Indyk [34] (as featured in this wiki)"; can you enlighten me about the actual differences between the two approaches? Piotr also indicated to me that their table featured here contained all the bounds for sparse recovery. How do your results fit into that table?
Dror responded with:
...First, let me tell you about LDPC codes. These channel codes come very close to approaching Shannon's channel capacity. They have had tremendous theoretical and practical impact, and are used in all kinds of wireless systems. The goal is to reliably communicate a message, which consists of bits, from the encoder to the decoder. The encoder multiplies the input (a bit-vector) by a binary matrix that is dominated by zeros and has a small percentage of ones (over the binary field, only 0 and 1 values are allowed). The sparsity of the matrix is very useful for fast encoding. At the decoder, LDPC codes construct a bipartite graph that relates the "measurements" (channel output) and the estimated transmitted bits. Messages are iteratively relayed between the two parts of the graph, where each message corresponds to the marginal distribution of one of the nodes. This technique is called "belief propagation" and is used in our current work.
My research at Rice was heavily influenced by information theory:
1. Distributed CS: our models contain information theoretic concepts such as "sparsity rate" (the number of large coefficients scales with the size of the problem) and "measurement rate" (number of measurements also scales linearly).
2. CS reconstruction: we identified LDPC codes as something that we could emulate for CS. First we introduced "sudocodes" for decoding strictly sparse signals from noiseless measurements; in that paper, the CS encoding matrix is an LDPC matrix. Next, in the current paper (preliminary results appeared in a technical report from 2006) we consider more general signal and noise models and use belief propagation for decoding.
3. Bounds on the number of measurements ("Measurements vs bits" paper, [Note from igor, the video presenting this work is here]) use the source-channel separation principle of information theory.
Indyk-Berinde also used an LDPC encoding matrix, but the decoding technique is different. Their papers in early 2008 use ell_1, their more recent papers use other techniques too. Wainwright, Wei, and Ramchandran also use similar encoding matrices. These matrices may have poor RIP performance, yet they nonetheless reconstruct well and can be used in very fast algorithms.
d. Update time: each column contains R=O(logN) nonzero entries, and so an update is O(logN) (good).
e. Recovery: N*log^2(N) (excellent)
f. Approx: I don't know if I can make such statements. Our Bayesian inference is not a geometric approach. That said, there might be geometric properties, because the Gaussian priors imply quadratic metrics. But a serious answer would require careful consideration.
Note that I believe that R=O(log(N/K)) can be used in our paper, thus providing (b) length K*log(N/K) (excellent); (c) encoder N*log(N/K) (excellent); and (d) update R=O(log(N/K)) (excellent).
My main comment related to the comparison between deterministic and random settings.
Take a look at our Figure 4 and at how our technique reconstructs better than ell_1 and CoSaMP. Reasonable people use the information they have at their disposal to improve their performance. Deterministic isn't good if its reconstruction quality is poor. Our Bayesian ("random") approach may do a bit worse if it uses a wrong model (see our Figure 6). A broader perspective is important; there is no single answer to this. When you have statistical information about your signal, you'll seriously consider a Bayesian approach, and our algorithm is good in that category.
I then pressed with:
You said: "...These matrices may have poor RIP performance, yet they nonetheless reconstruct well and can be used in very fast algorithms...". In the work of Berinde, Gilbert, Indyk, Karloff and Strauss, they define the RIP-1 condition, as opposed to RIP-2, for these sparse matrices. Irrespective of this property, I am personally wondering how the Juditsky-Nemirovski (JN) condition fares in providing some measure of how good those matrices are.
Dror kindly responded with:
...It seems that they are examining properties for a matrix to be good for ell_1 reconstruction. ell_1 was the first breakthrough in CS recovery, but not necessarily the best way to do things. While it is important to develop an understanding of why ell_1 works, it is even more important to develop algorithms that work even better. For example, I like papers like Wakin, Boyd and Candes (weighted ell_1); it gives better MMSE performance with less computation. Of course, within the Bayesian setting I like our CS-BP too...
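To make the sparse-matrix part of this discussion concrete, here is a tiny sketch (mine, not Dror's) of an LDPC-like CS measurement matrix: a binary M x N matrix with R nonzeros per column, the kind of sparsity that makes encoding and belief-propagation-style decoding fast; R, M and N below are arbitrary and no code-design niceties (girth, degree distributions) are attempted:

import numpy as np

def ldpc_like_matrix(M, N, R, seed=0):
    """Binary M x N measurement matrix with exactly R ones per column, placed at random rows."""
    rng = np.random.default_rng(seed)
    Phi = np.zeros((M, N), dtype=np.int8)
    for j in range(N):
        Phi[rng.choice(M, size=R, replace=False), j] = 1
    return Phi

Phi = ldpc_like_matrix(M=128, N=1024, R=8)   # R plays the role of the O(log N) column weight
# encoding a length-N signal then costs only about R*N additions, thanks to the sparsity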
While 2008 is about to shut down, I wanted to say thank you for your help. For the past two years, an increasing number of people have contributed to this blog. Direct contributors have included:
and enriched it wonderfully. Some of them even created their own research blogs; this is outstanding!
Since I am slow, my understanding of some of the CS papers or subject areas could not have come about were it not for a kind person answering my questions. I got lucky often, and most of the time an e-mail conversation ensued that I eventually felt was important for everybody else to read. The following people kindly OK'ed turning our offline discussions into a full post or some part of one:
As a matter of general policy, if you are having a detailed technical e-mail discussion with me, I generally ask if it's OK to publish it afterwards. I usually go through some proof-reading and remove non-technical items. Eventually, the final draft is OK'ed by you.
Much of the referral traffic to this blog was not self-generated! Terry Tao's, Andrew Gelman's and John Langford's blogs provide much of that traffic. Thank you, guys.
While I see less of this nowadays, I still feel strongly that, as the author of your own paper, you should go through the pain of building a website, which can take about 5 minutes using Google Sites these days. If you go through the effort of giving additional online information or adding your code, I generally try to make sure you get maximum exposure as a result. I always prefer reproducible research results to hard-to-reproduce papers. Please also consider that even if an algorithm looks easy for you to implement, it may not be the case for most. In particular, I intend to remain slow and dumb for the coming year (no good resolution for 2009 on that front).
I also say it is a pain for you to have a greater web presence because most of you are currently not being judged by your peers on these "details". You eventually will be! If not by your peers, then by the engineers in companies that want to do business with you, either as a future employee or as a consultant. There is no better time to start than today.
Recently, I had a discussion with one of you on the non-availability of preprints on his site. The e-mail discussion that ensued revealed that the lab policy had not been updated in a long time. The folks at Sherpa provide a database for checking your ability to self-archive under specific publishers' copyright policies. Please use this tool rather than relying on outdated policies or information.
As an aside, if you believe this blog provides a valuable service, consider donating to the webcrawler engine that makes it all possible at watchthatpage. If you are using this service for other purposes, please consider donating money to them as well. The PayPal system accepts credit cards, and additional information can be added in a message form where you might want to let them know about their huge contribution to the Nuit Blanche blog. The kids implementing this tool told me that it is a typical start-up, with two guys working in a dorm. Again, this blog and its community of readers would not be what they are without them, so please spare a buck, or twenty, or a hundred; some of you can think of it as just a business expense. I do not own stock in that venture.
Last but not least, the folks behind the codes and websites providing a resource to the CS community should be commended for doing an extraordinary job. They are listed on the Compressive Sensing 2.0 page. In particular, a big kudos goes to Mark Davenport for attending to and regularly updating the Rice Compressive Sensing site.
To the ones I forgot to list, please forgive me and do tell me. To all, thank you and Happy New Year. Enjoy watching the finale of Sydney's fireworks.
Gerry Skinner is a specialist of coded aperture imaging as it relates to astronomical observations of X-ray events in the sky. Because of the peculiar title of his insightful paper entitled "Coded Mask Imagers: when to use them, and when not?", I decided to contact him and ask him if he knew about compressive sensing. It is very rare to have a specialist tell you that the technology he has worked on for a while is not the technology you should use! Here is the small back-and-forth discussion we had. Gerry gave me permission to publish this on the blog. As an aside, I did not know about the group testing fertilizer problem. As I mentioned before, it might be wise to view this conversation as a way for the compressive sensing community to clearly articulate how and when coded aperture should be used as opposed to direct systems. After having introduced the subject to him by pointing to the paper of Roummel Marcia and Rebecca Willett [1], here is the discussion that ensued:
Gerry responded to my initial query with:
I was aware of compressed sensing only in very general terms and, interested by your mail, I have now done a little bit of browsing around on the subject. There are clearly links between coded mask imaging and compressive sensing, though really because both are members of a wider class of techniques (though I suppose you could apply the term 'compressive sensing' to that whole class). I have long been very aware of the links between coded mask imaging and other indirect measurement techniques, for example radio and optical interferometric imaging, Fourier transform spectroscopy and synthetic aperture radar (also radars involving transmitting a coded string of pulses). Further afield, I have noted analogies with 'experiment design' problems - for example, if you want to examine the effects of different fertilizers and crop treatments, instead of dedicating one part of your field to each, one can apply carefully chosen combinations.
Sometimes you are obliged to measure something other than what you would really like to get to. Other times there is an advantage in doing so. Coded mask imaging is used for a combination of these reasons. In many cases in high energy astronomy you can't build a focusing imaging system that would record directly the image that you would like to get (at present that is still the case for gamma-rays). There is an alternative - a pinhole camera. In high energy astronomy, compared with that, but not compared with a hypothetical focusing system, a coded mask system usually wins.
Even if you are forced to use non-focusing optics, it is important to think about when and why coded mask imaging wins. Apart from some very specific cases with few photons, it only does so if there is a detector background noise that doesn't increase as you let more and more photons in from the source. In space this is often the case because of high energy particles interacting in the detector. On the ground, attempts to use coded masks for medical imaging tend to find that there is no advantage to be obtained. A parallel situation occurs with Fourier Transform Spectroscopy. In the far infrared, FTS offers an advantage over a scanning IR spectrometer because the thermal noise in the detector remains the same even if you are making measurements at many wavelengths at the same time. It doesn't help in the visible because there the detector noise can be mainly photon noise (shot noise on the signal).
That is why, despite having worked on them for 30 years, I spend a lot of time trying to persuade people not to use coded mask telescopes, or at least not to assume they can do magic in all circumstances (it is one of the reasons that I would like to develop gamma-ray lenses). Similarly, with any indirect measurement technique, you have to examine very carefully what you are gaining.
As regards the specific questions in your mail:
The data from two major space missions that are currently in orbit and that use coded masks, along with the software to allow their analysis, are public. I refer to NASA's Swift mission (http://heasarc.gsfc.nasa.gov/docs/swift/archive/) and ESA's Integral (http://isdc.unige.ch/index.cgi?Data+info). Getting into the analysis of complex data such as is produced by these missions is not trivial, though. The file formats and data analysis systems are both based on the FITS file format, but are not the same for the two missions.
Because the instrument design is such that cross-correlation algorithms are close to the best one can do, most of the standard software depends on this. Non-linear techniques have been used (particularly ones based on likelihood and on maximum entropy), and for Integral some software is available to do this. Personally, I have never seen any advantages that outweigh the problems in the quantitative interpretation of the results that stem from the implicit non-linearity.
Recoiling from his observations, I then asked:
I have seen you reference Roberto Accorsi in a previous paper [and mentioned in this blog here]; is this affirmation (on the ground...) coming out of discussions with him?
Gerry kindly responded with:
I have had discussions with him recently, but also with quite a number of other people in medical and industrial fields, in several countries, over a good many years. Even where there has been initial enthusiasm, there has been an eventual realization that coded aperture techniques offer them no real advantage. In addition to the all-important background noise issue that I mentioned, there is also the question of how the noise is distributed over the image (or equivalent space) after the reconstruction process. Still talking in imaging terminology: with a one-to-one correspondence between measurements and final image pixels, there is less effect of a bright region on low brightness ones than with, for example, a coded mask imager, which spreads noise more evenly over the image. Unless you are dominated by signal-independent detector noise, the image noise may be lower on average, but it tends to be higher for bright regions. This is often the opposite of what physicians want.
After I acknowledged the following:
A parallel situation occurs with Fourier Transform Spectroscopy. In the far infrared, FTS offers an advantage over a scanning IR spectrometer because the thermal noise in the detector remains the same even if you are making measurements at many wavelengths at the same time. It doesn't help in the visible because there the detector noise can be mainly photon noise (shot noise on the signal).
Gerry further made the point:
This issue is crucial. You have to be able to explain where and why a proposed indirect technique offers an advantage.
A long time ago I developed something very similar to the one-pixel camera. We placed a rotating disk in front of a non-imaging Indium Antimonide IR detector (the only sort available at that time). It was arranged to place a series of near-orthogonal coded-mask-like patterns in front of the detector. Although it worked in general terms and we demonstrated it on a telescope in the Canary Isles, before it was fully debugged, pixellated IR detectors developed for the military started becoming available and the incentive for the work disappeared - they were bound to do better. Have you found and read Martin Harwit's book on "Hadamard Transform Optics", written at about the same time (1979) as the above work and containing the same ideas (and more)?...
.... One of the incentives for us in using these techniques (Likelihood and Maximum Entropy [see refs. [3, 4, 5]]) is that they can properly handle Poisson statistics where least-squares techniques do not. As a by-product of this, they normally implicitly introduce a positivity constraint that some people like, on the basis that sky intensities cannot be negative (though I find the associated non-linearity problematic). And they can be configured to use whatever prior information you have about the scene (provided you can formulate what it is that you think you know).
Let us note that Christopher Brown made a similar comment at the end of his thesis entitled "Multiplexing and Random Arrays" back in 1972 [2, p. 132]:
Multiplexing can only be recommended with care, since the advantages it provides with signal-independent noise can be canceled or overcome by signal dependent noise. Detailed analysis and comparison with a competitive non multiplexing system revealed that multiplexing would often be advantageous, even discounting nonlinear effect in the film. The analysis also showed the amount of nonlinear advantage which would overcome disadvantages produced by multiplexing.
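For readers who want a rule of thumb, here is my own back-of-the-envelope version of this multiplex (Fellgett) advantage argument, not something taken from Gerry's or Brown's text. With N spectral channels observed over a fixed total time,

SNR_multiplex / SNR_scan  ~  sqrt(N)      when the noise is detector-limited (signal-independent),
SNR_multiplex / SNR_scan  ~  1 (or less)  when the noise is photon (shot) noise, since letting all channels fall on the detector raises the noise along with the signal.

That is roughly why FTS pays off in the detector-noise-limited far infrared and not in the shot-noise-limited visible, and why coded masks pay off in space and not in the clinic.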
There are many themes here that have already been picked up on in CS, and I am sure most of these themes can be made clearer across different engineering fields. The arrival of wavelets in the numerical analysis world provided some common ground and eventually advances in very different techniques and engineering fields. I can clearly see CS helping us better understand indirect systems and lay down better theoretical bounds, while providing additional data and means of discovery. Right now, one can only hope to tackle the different CS hardware implementations as is currently done for coded mask telescopes [6].
Gerry eventually added:
The only thing that I would add is that in experiment design the problem is often assumed to be linear, or described by a variance/covariance matrix, and the example I gave [the fertilizer problem] may not be well chosen. A couple of references from the vast literature on this subject are
Credit photo: NASA/JPL/Space Science Institute, the closest photo of Enceladus ever. The image was taken with the Cassini spacecraft narrow-angle camera on Aug. 11, 2008, at a distance of approximately 1,288 kilometers (800 miles) above the surface of Enceladus. Image scale is approximately 10 meters (33 feet) per pixel.