CMSC 733 Pictorial Information Processing

posted in: Fall2013, Graduate

Prof. David Jacobs is very nice, so just take it! The final exam is closed-book and really tough, but if you complete all the homework and extra credit, you will be fine. Reviews from

Good professor, good class. It’s not so much about image processing as it is about computer vision, which turns out to be much more interesting. Dr. Jacobs is very familiar with the material and makes it easy to understand. The coursework is not very hard, but Matlab can be frustrating sometimes.


Good professor, definitely knows his material. Goes a bit fast, so if you can’t keep up, definitely ask questions – he likes questions.

This is not the course page; the course page is at

There were 5 homework assignments in total in Fall 2013. All the homework is confidential, but I can provide it if you have a good reason. Here is some of what I did in this class:


"To know what is where, by looking." - Marr

  • Psychology: 50% of the cerebral cortex is devoted to vision; vision is how we experience the world.
  • Engineering
  • Vision is inferential: Light (shadow); Geometry (cylinder vs. cube); Prior knowledge (Hill vs. meteor Crater, USD, light comes from left)
  • Inference -> Computation; Building machines that see; applying computation to images; Modeling biological perception

Scope of CMSC733

  • Breadth: familiarity with all areas of computer vision
  • Depth: Intuitive understanding of fundamental principles of vision


My mind map for an overview of this course:



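  • Fourier transforms, convolution (linear shift invariant): $$Y = A_c I = R_2 \, \mathrm{diag}(\lambda) \, R_1 I$$, where $$R_1$$ is the Fourier transform, $$R_2$$ is its inverse, and $$\lambda$$ holds the Fourier coefficients of the convolution kernel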
  • Image denoising – linear
  • Image denoising – nonlinear
  • Canny edge detector
  • Grab Cut: Interactive Foreground Extraction using Iterated Graph Cut
  • Background subtraction
  • Statistical modeling of images: mixture of Gaussians, E-M, Markov processes, Markov random fields
  • Texture in Boundary Detection
  • Texture Synthesis vs. Pattern Repeated
  • Image features – corners
  • Image matching – RANSAC, Hough transform
  • Geometric transformations
  • Mosaicking
  • Tracking
  • Biological vision
  • Cameras and perspective
  • Stereo
  • Optical Flow
  • SFM
    • Image space: $$I_1 = [u, v, 1]$$
    • Camera space: $$C_1 = [C_x, C_y, C_z]$$
    • World space: $$W_1 = [W_x, W_y, W_z]$$
    • The essential matrix relates corresponding points in two camera spaces: $$C_a^T E_{ab} C_b = 0$$; it encodes the relative rotation and translation
      • $$E_{ab} = C_{ba}^T [r_b^{ab}]_\times$$, where $$[\cdot]_\times$$ denotes the skew-symmetric cross-product matrix
      • $$x_1 = O_1P, x_2 = O_2P = Rx_1 + t$$
      • $$x_2 \cdot (t \times (R x_1)) = 0$$
      • $$x_2^T [t]_\times R x_1 = 0$$
    • The fundamental matrix relates corresponding points in two image spaces: $$I_a^T F_{ab} I_b = 0$$
      • $$F_{ab} = K_a^{-T} E_{ab} K_b^{-1}$$
      • With $$q = K p$$: $$q_a^T F_{ab} q_b = p_a^T K_a^T K_a^{-T} E_{ab} K_b^{-1} K_b p_b = p_a^T E_{ab} p_b = 0$$
      • $$rank(F) = 2$$
    • Line: $$l = [a b c]$$, so that $$ax + by + c = 0$$
    • Point on a line: $$x = [u v 1], x^T l = 0$$
    • Line through two points: $$l = x_1 \times x_2$$
    • 3D => 2D: $$I = P W$$, where $$P$$ is the projection matrix
    • 2D => 3D: $$W = P^+ x + \lambda C$$, where $$P^+$$ is the pseudo-inverse of $$P$$ and $$C$$ is the center of the camera
    • Line: $$l = C \times P^+ x$$
    • Epipolar line: $$l' = (P'C) \times (P'P^+x)$$, where $$P'$$ is the projection matrix of the second camera
    • The essential matrix is the fundamental matrix for calibrated cameras
  • Lighting, photometric stereo
  • Classification – including fine-grained classification
  • Shape
  • Detection: face-priority auto exposure (AE), for when part of the face is too bright; Google Street View face blurring
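
To make the SFM bullets above concrete, here is a minimal pure-Python sketch (not the course's Matlab code) that builds an essential matrix as the cross-product matrix of t times R and checks the constraint $$x_2 \cdot (t \times (R x_1)) = 0$$ on one point. The rotation angle, translation, and point coordinates are made-up values for illustration, and the helper names (`skew`, `matmul`, and so on) are my own. It also shows the "line through two points" rule with a cross product.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(M, v):
    return [dot(row, v) for row in M]

def matmul(A, B):
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]

def skew(t):
    """Cross-product matrix: matvec(skew(t), v) equals cross(t, v)."""
    return [[0.0, -t[2], t[1]],
            [t[2], 0.0, -t[0]],
            [-t[1], t[0], 0.0]]

# Hypothetical relative pose: 10 degree rotation about the y axis plus a translation.
theta = math.radians(10.0)
R = [[math.cos(theta), 0.0, math.sin(theta)],
     [0.0, 1.0, 0.0],
     [-math.sin(theta), 0.0, math.cos(theta)]]
t = [0.5, 0.1, 0.0]

E = matmul(skew(t), R)                           # essential matrix from t and R

x1 = [0.3, -0.2, 4.0]                            # point P in camera-1 coordinates
x2 = [a + b for a, b in zip(matvec(R, x1), t)]   # same point in camera 2: R x1 + t

residual = dot(x2, matvec(E, x1))                # x2^T E x1, zero up to rounding
print("epipolar residual:", residual)

# Line through two homogeneous image points: l = p x q.
p, q = [0.0, 0.0, 1.0], [2.0, 2.0, 1.0]
line = cross(p, q)
print("line:", line, "contains p and q:", dot(line, p) == 0 and dot(line, q) == 0)
```

The residual vanishes because x2 is a sum of R x1 and t, and both are perpendicular to t × (R x1).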

My Partial Notes:


My Homework Screenshots:

Edge detection

Feature Selection and SVM Classification:


Photo Stitching using SIFT and Transformation





Stereo Matching


After alpha beta expansion



  • Correspondence=C between images
    • Stereo is all about C
    • Optical flow (OF) is C with a linear approximation
    • Classification requires C
  • Image gradients=G and image change
    • Edge detection = G
    • Corners: G in 2 directions
    • OF: combines temporal and spatial G
    • Biological vision sensitive to changes
  • Statistical Modeling
    • Diffusion processes, background, texture, classes
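
As a small illustration of "Edge detection = G" and "Corners: G in 2 directions", the sketch below computes central-difference gradients on a synthetic image and then the smaller eigenvalue of the 2x2 structure tensor in a 3x3 window. This is one common way to score corners (in the spirit of the Harris and Shi-Tomasi detectors), not necessarily what the homework used; the image, window size, and function names are assumptions of mine.

```python
import math

def gradients(img):
    """Central-difference gradients (Ix, Iy) on interior pixels; borders stay 0."""
    h, w = len(img), len(img[0])
    Ix = [[0.0] * w for _ in range(h)]
    Iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            Ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            Iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return Ix, Iy

def min_eigenvalue(Ix, Iy, cy, cx, r=1):
    """Smaller eigenvalue of the structure tensor summed over a (2r+1)x(2r+1) window."""
    a = b = c = 0.0
    for y in range(cy - r, cy + r + 1):
        for x in range(cx - r, cx + r + 1):
            a += Ix[y][x] ** 2
            b += Ix[y][x] * Iy[y][x]
            c += Iy[y][x] ** 2
    # Closed-form eigenvalues of the symmetric matrix [[a, b], [b, c]].
    half_trace = (a + c) / 2.0
    disc = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return half_trace - disc

# Synthetic 9x9 image: a bright square filling the lower-right region.
img = [[1.0 if (y >= 4 and x >= 4) else 0.0 for x in range(9)] for y in range(9)]
Ix, Iy = gradients(img)

flat = min_eigenvalue(Ix, Iy, 2, 2)     # uniform region: no gradient at all
edge = min_eigenvalue(Ix, Iy, 6, 4)     # vertical edge: gradient in one direction only
corner = min_eigenvalue(Ix, Iy, 4, 4)   # square's corner: gradient in two directions
print(flat, edge, corner)
```

Only the corner window has a clearly positive smaller eigenvalue. On the straight edge the gradient is large, but it points in a single direction, so the second eigenvalue stays at zero.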

Depth: Key equations and algorithms

  • Convolution theorem
  • Diffusion equation
  • Brightness change constraint equation
  • Matrix factorization for SFM
  • Graph cuts for MRF
  • Dynamic Programming for Stereo
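
For reference, the first three items have standard textbook forms, written here in the same $$ notation as the rest of these notes. The symbols are my own choice (f and g for signals, I for image brightness, c for the diffusivity, (u, v) for the flow vector) and may differ from the lecture slides.

Convolution theorem (convolution becomes pointwise multiplication in the Fourier domain):

$$\mathcal{F}\{f * g\} = \mathcal{F}\{f\} \cdot \mathcal{F}\{g\}$$

Diffusion equation (constant $$c$$ gives linear diffusion, i.e. Gaussian smoothing; letting $$c$$ depend on $$|\nabla I|$$ gives nonlinear, edge-preserving diffusion):

$$\frac{\partial I}{\partial t} = \nabla \cdot (c \, \nabla I)$$

Brightness change constraint equation for optical flow (spatial gradients $$I_x, I_y$$ and temporal gradient $$I_t$$):

$$I_x u + I_y v + I_t = 0$$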

Course Work

  • Lectures
  • Problem sets / projects (40%)
    • Implement 5 classic algorithms, Matlab (Calculus, Geometry, Probability), CS, Signal Processing
  • Midterm – Take home (20%)
  • Final Exam – In class (40%)
    • Equations, algorithms
  • Readings
    • Listed on class web page
    • Optional texts: Szeliski, Forsyth and Ponce

Fourier Transform -> Convolution (Sharpen, derivatives) -> Smoothing (denoise, resize)
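
A quick way to convince yourself of this chain is to smooth a 1D signal twice, once by direct circular convolution and once by multiplying Fourier transforms, and compare. The sketch below is pure Python with a naive O(n^2) DFT; the signal and the [1 2 1]/4 smoothing kernel are made-up illustration values, and in practice you would use an FFT routine instead.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

def circular_convolve(signal, kernel):
    """Direct circular (wrap-around) convolution of two equal-length sequences."""
    n = len(signal)
    return [sum(signal[(i - k) % n] * kernel[k] for k in range(n)) for i in range(n)]

signal = [0.0, 0.0, 1.0, 0.0, 0.0, 4.0, 0.0, 0.0]    # two spikes
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]  # [1 2 1]/4 centred on index 0

direct = circular_convolve(signal, kernel)

# Transform, multiply pointwise, transform back.
product = [a * b for a, b in zip(dft(signal), dft(kernel))]
via_fourier = [v.real for v in idft(product)]

print(direct)
print([round(v, 6) for v in via_fourier])
```

Both routes spread each spike over its neighbours while keeping the total sum of the signal, which is the smoothing (denoising) step in the chain above.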


Cosine and Gaussian Transforms



Kalman Filter:



A Review of Nonlinear Diffusion Filter:



Particle Filtering:



Products and Convolutions of Gaussian Probability Density Functions



Bags of Words:


