This list of questions has been floating around the open internet for a while, but we have yet to find the original source. If you know the source, or are the source, please contact jacob@moneyscience.com and we will take appropriate action.

**Question #1 (statistics)**

How do you test whether a data sample is normal or not?

(Comment: this tests your knowledge in basic statistics; there are several different methods to test normality.)

**Question #2 (math)**

Show that a set is convex if and only if its intersection with any line is convex. Show that a set is affine if and only if its intersection with any line is affine.

(Taken from chapter 2 of Boyd and Vandenberghe, Convex Optimization)

(Comment: concepts related to convex sets are important in optimization, and many quant jobs involve optimization research.)

**Question #3 (finance – bond pricing)**

I don’t know anything about bond pricing, but I’ve heard people use something called the discount rate when they price bonds. Can you explain what this “discount rate” is? Why is it important? Where do I get its value? What is the current discount rate (as of today)? When I price a 30-year bond, should I use today’s discount rate or should I use a different discount rate for each of the next 30 years?

(Comment: typical set of basic questions asked at a lot of quant interviews, not just those for fixed income-related positions)

**Question #4 (probability theory)**

Say you are on a game show [historical sidenote: this question was first played on the 1960s American game show Let’s Make a Deal, hosted by Monty Hall], and there are three closed doors. Behind one door is a car, the prize you dream of, and behind the other two are goats. You pick a door. The host, who knows what’s behind each door, opens another door, which reveals a goat. Now the host lets you make another choice: should you stick with your first door choice, or should you switch and pick a different door, in order to win the car?
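One way to sanity-check an answer to this question is a quick simulation. The sketch below is only an illustration: the function name `monty_hall_win_rate`, the trial count, and the seed are assumptions made here, and it assumes the host always opens a goat door other than the player's pick, as the question states.

```python
import random

def monty_hall_win_rate(switch, trials=100_000, seed=0):
    """Estimate the chance of winning the car when always switching or always staying."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)   # door hiding the car
        pick = rng.randrange(3)  # player's first choice
        # The host opens a door that is neither the player's pick nor the car.
        opened = rng.choice([d for d in range(3) if d != pick and d != car])
        if switch:
            # Move to the one remaining closed door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

if __name__ == "__main__":
    print("stay:  ", monty_hall_win_rate(switch=False))
    print("switch:", monty_hall_win_rate(switch=True))
```

Under these assumptions the estimates should land near 1/3 for staying and 2/3 for switching, which matches the standard conditional-probability argument.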

**Question #5 (applied math)**

What’s a Hermitian matrix? What important property does a Hermitian matrix’s eigenvalues possess? What’s the practical implication of this property in applications?

**Question #6 (statistics)**

Random variable X is distributed as N(a, b), and random variable Y is distributed as N(c, d). What is the distribution of (1) X+Y, (2) X-Y, (3) X*Y, (4) X/Y?

(Comment: another very popular quant interview question, regardless of whether the position itself involves statistical modeling)

**Question #7 (finance – options)**

What’s put-call parity in option pricing? How does one derive this relationship? What crucial assumptions are necessary?

Tough case question: if you observe put-call parity not currently holding in the market, how do you make money off this observation? As you trade, what do you need to watch out for and what risks must you be aware of?

**Question #8 (math – stochastics)**

Show that exp{-t/2 + W(t)} is a martingale.

[Courtesy of Dr. Yun Cheng of ITG]

**Question #9 (programming – C++)**

Is the following valid C++ code? If so, what does it print?

cout << (int *) "Home of the jolly bytes";

[Taken from chapter 4 of Prata, C++ Primer Plus (5th ed.)]

**Question #10 (econometrics – time series)**

Is an AR(p) process stationary? Why or why not?

Tough question: in practice, how do you determine the order of an MA or AR model?

**Question #11 (econometrics – time series)**

What’s a GARCH model? Why is it an important/useful model? When would you use the GARCH model?

Can you write down its general formulation? What does the GARCH model say in plain English? What does it “try” to achieve?

How do you determine the order of the model? How do you estimate the model in practice?

Tough follow-up question: how do you implement a GARCH model in Excel?

**Question #12 (econometrics – time series)**

People use the GARCH model to study volatility. Can you tell me if we can use the GARCH framework to study the correlation between two assets/time series? If so, what additional assumptions and/or adjustments must we make to the original GARCH model?

**Question #13 (brainteaser)**

With an ordinary tape measure and a watch, how would you measure the exact height of the Empire State Building (or the Sears Tower, or the Big Ben Clock Tower, or the Oriental Pearl TV Tower, or any famous tall building)?

**Question #14 (brainteaser – logic – deduction)**

(There are many versions of this type of question. Here are some examples.)

1. How many pizzas are consumed every day in the U.S.?

2. How many gas stations are there in the U.S.?

3. How many cars are stolen every month in the U.S.?

4. How many prostitutes do you think work the streets in New York (or London, or L.A., or Shanghai, or Tokyo, or Singapore, …)?

5. How many quants are there in the world?

6. How many people make their livings on Wall Street?

7. How many university graduates try to find a job on Wall Street each year?

8. How many tennis balls can you fit in a Boeing 747 (or Airbus A320)?

9. How many Yankees fans go to every home game each season? [I was asked this at the last round of my McKinsey interviews]

10. How many people in China can speak English? [A little different from the previous nine…]

**Question #15 (case question hard to categorize)**

You work for an arbitrage desk. Your model shows that if you bought stock A and simultaneously sold stock B, you have a 51.3% chance of making a profit by today’s close. Should you make this trade?

**Question #16 (applied math – control theory)**

The latest “hot” topic in financial research is using the Kalman filter in various applications. Can you explain the basic idea behind the Kalman filter (i.e., what does the filter try to do with the data)? Can you write the basic Kalman filter model? What are some of the applications of the Kalman filter?

Tough question: How do you estimate (or implement) the Kalman filter? For example, to study stock price movement.

**Question #17 (finance – asset pricing)**

Tell me the intuition behind CAPM. Can you write down the model? What does each of the variables stand for?

Two tough advanced questions: How do you test CAPM using real data? What are the major points of criticism against CAPM?

**Question #18 (econometrics)**

When modeling binary-choice problems, what are the advantages of using logit over probit? What are the disadvantages of logit vs. probit?

What about multiple-choice models: is logit or probit better?

**Question #19 (finance)**

What does VaR (value at risk) measure? What are some of the assumptions behind the VaR concept? Given two portfolios A and B, does the following relationship hold: VaR(A+B) = VaR(A) + VaR(B)? Why or why not (i.e., prove your previous answer)?

**Question #20 (applied math – stochastic calculus)**

What is Ito’s Lemma? What is its significance in studying stochastic processes? How is it used in finance? Can you write out the equation?

When used to model financial derivatives, what assumptions must be made of the properties of the derivatives for Ito’s Lemma to be applied correctly?

**Question #21 (programming)**

I give you a text file, x.txt, which has millions of records with three columns in each record:

ID, age, income

The records are sorted by ID, and no two IDs are the same.

Now, write a short program in each of the following languages to pull out 10,000 randomly selected records from x.txt. Put these 10,000 randomly pulled records in an output file called y.txt.

C++

Visual Basic

Matlab

Perl

Python

SAS

R or S-Plus

UNIX shell script
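For the Python case, one possible approach is reservoir sampling (Algorithm R), which reads the file once and never holds more than the sample in memory. This is a sketch, not the only valid answer. The names `reservoir_sample` and `sample_file` are made up for illustration, and it assumes x.txt has one record per line and no header row.

```python
import random

def reservoir_sample(records, k, rng=None):
    """Return k records chosen uniformly at random from an iterable (Algorithm R)."""
    if rng is None:
        rng = random.Random()
    sample = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Record i (0-based) is kept with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = record
    return sample

def sample_file(src="x.txt", dst="y.txt", k=10_000):
    """Write k randomly chosen lines of src to dst (assumes one record per line)."""
    with open(src) as fin:
        chosen = reservoir_sample(fin, k)
    with open(dst, "w") as fout:
        fout.writelines(chosen)
```

Calling `sample_file()` would produce y.txt from x.txt. Note that the output is not guaranteed to stay sorted by ID, and that a file with fewer than 10,000 records is returned whole.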

**Question #22 (programming)**

This is a tougher version of the previous question (#21).

You get the same input file x.txt with millions of records sorted by ID. However, some records are missing either age or income.

Now, your task is to write a program to pull out a random sample of 10,000 records, but only those with neither age nor income missing.

(Comment: both questions #21 and #22 test your ability both to write a working program and to produce an efficient one – but first and foremost you must write a program that works correctly)

**Question #23 (applied math – linear algebra)**

In linear algebra, why are we interested in matrix decompositions? Explain each of the following:

LU decomposition

Singular value decomposition (SVD)

Cholesky decomposition

QR decomposition

When and how is each of these decomposition techniques applied?

(Comment: matrix operations, including decompositions, are extremely important in applied quantitative finance – they are often the glue between modeling and implementation)

**Question #24 (mathematical brainteaser)**

Answer this as fast as you can, without writing anything down:

The perimeter of a right triangle is 5 inches. The two legs are each 2 inches long. What’s the length of the hypotenuse?

(Comment: this is one of my favorite questions)

**Question #25 (finance – asset pricing)**

Can you show me how the APT model is derived? What’s the intuition behind APT? How does it compare to CAPM? What are some of the criticisms of APT?

**Question #26 (mathematics)**

What is Jensen’s Inequality? What are some of its applications? Can you write out the inequality and provide a sketch of a proof?

(Hint: Jensen’s Inequality is an important concept in probability theory; other important inequalities include Hölder’s Inequality and Minkowski’s Inequality)

**Question #27 (economics – game theory)**

What’s a Nash equilibrium? Can you write down its formal definition? Can you provide an example?

**Question #28 (programming – SQL)**

In SQL, what’s an inner join and what’s an outer join? What’s the difference between a left join and a right join?

(Comment: SQL is the standard programming language in the database world, and more and more quant shops are setting up SQL-based databases and data warehouses)

**Question #29 (statistics)**

How do you calculate sample variance? Show me the formula and implement it in C or C++.

**Question #30 (finance case question)**

There are two stocks A and B. I already own A, but I’m thinking of buying B to replace A. (I can only own either A or B at the same time.) I’m a U.S.-based investor subject to all U.S. taxes. How will my tax situation affect my decision whether to keep A, or to sell A and buy B? Please explain in detail.

[Courtesy of Dr. Warren Hrung of the New York Fed]

**Question #31 (probability theory)**

There are 30 people in my group. What are the odds that at least two people share the same birth month and day (e.g., July 25)? What are the odds that exactly two people share the same birth month and day? Finally, what are the odds that everybody was born in the same decade (where a decade is defined as any ten-year span, not necessarily “50s” or “60s” or “70s” etc.)?
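The first part can be computed through the complement: the chance that all birthdays are distinct. The sketch below assumes 365 equally likely birthdays, independence between people, and no leap days; none of these assumptions is stated in the question. The function name `prob_shared_birthday` is illustrative.

```python
def prob_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday), assuming uniform, independent birthdays."""
    p_all_distinct = 1.0
    for i in range(n):
        # Person i must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

if __name__ == "__main__":
    print(prob_shared_birthday(30))
```

For a group of 30 this comes out at roughly 71%. The other two parts of the question need different counting arguments and are not covered by this sketch.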

**Question #32 (statistics)**

What’s the difference between the t-stat and R^2 in a regression? What does each measure? When you get a very large value in one but a very small value in the other, what does that tell you about the regression?

**Question #33 (finance – options)**

Can you plot an option’s delta as a function of the underlying stock’s price? What does this plot tell you?

**Question #34 (finance – portfolio theory)**

Consider the utility function U(W) = W^(-1/2). What are the characteristics of this function with respect to absolute and relative risk aversion?

Explain the difference between absolute and relative risk aversion.

[First question taken from chapter 10 of Elton, et al. Modern Portfolio Theory and Investment Analysis]

**Question #35 (programming – Perl)**

In Perl, given a hash %bonus where the key is employee ID and the value represents the employee’s expected year-end bonus, sort this hash by value from highest bonus to lowest.

Bonus question: how would you do this whole ID -> bonus mapping and sorting in C++ or C#?

**Question #36 (brainteaser)**

(The interviewer writes down the following equation on the whiteboard…)

XI + I = X

This is an equation expressed in Roman numerals. Imagine this equation is actually written out using sticks. Without touching or adding any stick, how can you make this equation true?

**Question #37 (financial economics-related case question)**

When you trade stocks, what are some of the different types of cost associated with your trading? How would you mitigate each type of cost?

(Hint: a cost need not be explicit…)

**Question #38 (econometrics)**

What are some of the causes of heteroskedasticity? How do you test for the presence of heteroskedasticity? (Please name at least two tests.) Finally, what are some of the techniques for dealing with heteroskedasticity?

**Question #39 (financial time series)**

What is Principal Component Analysis? Please explain in plain English as well as write down the model.

How does PCA differ from factor analysis?

(Comment: PCA is used heavily in studying asset returns; it is, for instance, a backbone of statistical arbitrage models)

**Question #40 (mathematics – number theory)**

Can you show that, for any prime number p that is at least equal to 5, the value of p^2 - 1 is a multiple of 24 (i.e., wholly divisible by 24)?
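A brute-force check is no substitute for the proof, but it is a cheap way to gain confidence in the claim before attempting one. The usual argument factors p^2 - 1 as (p - 1)(p + 1): both factors are consecutive even numbers, so their product is divisible by 8, and one of them must be divisible by 3 because p itself is not. The helper names below are illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def counterexamples(limit):
    """Primes p with 5 <= p < limit for which p*p - 1 is NOT a multiple of 24."""
    return [p for p in range(5, limit) if is_prime(p) and (p * p - 1) % 24 != 0]

if __name__ == "__main__":
    print(counterexamples(10_000))  # an empty list means no counterexample was found
```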

**Question #41 (probability theory)**

You are offered the chance to play a game. A fair coin is tossed repeatedly until you get the first tails, at which point the game ends and you get the prize. The prize “pot” starts at $1 and doubles each time you get heads. So for instance, if you get heads on the first toss, the pot becomes $2. If you get heads again on the second toss, the pot becomes $4. If you get heads a third time, the pot becomes $8. If the fourth toss comes up tails, you win and take home the $8 prize.

Before you play, you must pay a fee to enter this game. The question is, what’s the maximum amount you’re willing to pay in order to play this game? Explain your answer carefully.
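This game is usually known as the St. Petersburg paradox. Under the rules above, the outcome "k heads, then tails" has probability 2^-(k+1) and pays 2^k dollars, so every term of the expectation series contributes exactly 1/2 and the sum diverges. The sketch below only illustrates that arithmetic. The function names are assumptions made here, and the code says nothing about what a sensible entry fee is, which is the real point of the question.

```python
import random

def play_once(rng):
    """Play one game: the pot starts at $1 and doubles on every head until the first tail."""
    pot = 1
    while rng.random() < 0.5:  # treat a draw below 0.5 as heads
        pot *= 2
    return pot

def partial_expected_prize(n_terms):
    """Sum of the first n_terms terms of the expectation series."""
    # Term k: probability 2**-(k+1) of k heads then a tail, times a prize of 2**k.
    return sum((2 ** k) * 0.5 ** (k + 1) for k in range(n_terms))

if __name__ == "__main__":
    rng = random.Random(0)
    prizes = [play_once(rng) for _ in range(100_000)]
    print("sample mean prize:", sum(prizes) / len(prizes))
    print("first 20 terms of E[prize]:", partial_expected_prize(20))
```

The partial sums grow without bound (n terms sum to n/2), while the sample mean from any finite number of plays stays modest and is dominated by a few rare large payouts.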

**Question #42 (probability theory)**

What’s the expectation of a uniform(a, b) distribution? What’s its variance? Please derive your answers in mathematical terms, starting with the pdf.

**Question #43 (finance – derivatives)**

Kindly explain the difference between a futures contract and a forward contract. How are they priced differently?

**Question #44 (statistics)**

Given a dataset, how do you determine its sample distribution? Please provide at least two methods.

**Question #45 (mathematics – algebra)**

Let n be a natural number. Give the reduced expression for the following:

(1) 1+2+3+…+n

(2) 1+2^2+3^2+…+n^2

(3) 1+2^3+3^3+…+n^3

(4) 1+2^k+3^k+…+n^k, where k is another natural number.
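The first three sums have well-known closed forms, and a short script can confirm a candidate answer against brute-force summation. The function names below are illustrative. Part (4) has no single elementary closed form: Faulhaber's formula expresses it as a polynomial of degree k+1 in n with coefficients built from Bernoulli numbers, and is not implemented here.

```python
def sum_first_powers(n):
    """Closed forms for 1+2+...+n, 1+2^2+...+n^2 and 1+2^3+...+n^3."""
    s1 = n * (n + 1) // 2
    s2 = n * (n + 1) * (2 * n + 1) // 6
    s3 = s1 ** 2  # the sum of cubes equals the square of the plain sum
    return s1, s2, s3

def brute_force(n):
    """Direct summation, used only to check the closed forms."""
    ks = range(1, n + 1)
    return sum(ks), sum(k ** 2 for k in ks), sum(k ** 3 for k in ks)

if __name__ == "__main__":
    print(sum_first_powers(10), brute_force(10))
```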

**Question #46 (programming – C++)**

What are virtual functions in C++? What are they used for? Please write down an example of a virtual function to illustrate its usage.

**Question #47 (applied math – stochastics)**

A random walk process starts at the point 0. What is the probability that this random walk hits -2 before it hits 3? What if the process is a Brownian motion instead?

[Courtesy of Dr. Yun Cheng of ITG]
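A simulation can be used to check an answer to the random-walk part. The sketch below assumes a simple symmetric random walk with steps of +1 or -1, each with probability 1/2; the question does not spell this out. The function name and parameters are illustrative.

```python
import random

def prob_hit_lower_first(lower=-2, upper=3, trials=20_000, seed=0):
    """Estimate P(walk started at 0 reaches `lower` before `upper`)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        pos = 0
        # Keep stepping until one of the two barriers is reached.
        while lower < pos < upper:
            pos += 1 if rng.random() < 0.5 else -1
        hits += (pos == lower)
    return hits / trials

if __name__ == "__main__":
    print(prob_hit_lower_first())
```

The estimate should sit close to 3/5, the value given by the gambler's-ruin (optional stopping) argument: since the walk is a martingale, P(hit -2 first) = 3 / (2 + 3). The same argument applies to a standard Brownian motion without drift, so the answer there is also 3/5.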

**Question #48 (finance – options)**

What is the lower bound for the price of a European call option on a non-dividend-paying stock? Can you derive this lower bound in a formal fashion?

Now, what if the call option is American? What if the stock pays a dividend every quarter?

**Question #49 (general computing – Excel)**

There are at least two ways in Excel to perform an OLS regression. What are they? What are some of the limitations of doing OLS in Excel (as opposed to using a real statistical package like EViews, Stata, R, S-Plus, or SAS)?

**Question #50 (finance – capital market)**

Why do price spreads exist in asset-trading markets? Can spreads ever be negative? If so, under what conditions?

Tougher: what are some examples of markets where price spreads do not necessarily exist?