Demo 1: Collecting Data About Algorithms

An interactive exploration of the performance spectrum of ordinal algorithms for k-secretary. Based on research by Buchbinder, Jain, Singh (2014) and Chan, Chen, Jiang (2015).

The K-Secretary Problem

Input: \(n\) items arrive in uniformly random order with a total ordering (only relative comparisons are observable). An algorithm has \(K\) quotas for selecting items, and decisions are irrevocable.

Goal: Maximize \(\mathbb{E}[\text{number of selected items among the } K \text{ best overall}]\)

$$\text{Competitive Ratio} = \frac{\mathbb{E}[\text{number selected among } K \text{ best}]}{K}$$

K-Potential

An item arriving at step \(i\) is a \(k\)-potential if it ranks among the top \(k\) items seen so far (steps \(1, \ldots, i\)). This is the only information available to an ordinal algorithm.

$$\text{k-potential} = \text{rank } k \text{ among items } \{1, \ldots, i\}$$

Ordinal Algorithm

An ordinal algorithm makes selection decisions based solely on relative rankings, not cardinal values. It can only observe whether the current item is a 1-potential, 2-potential, ..., or K-potential.

Quotas

The algorithm has \(K\) quotas \(Q_1, Q_2, \ldots, Q_K\) for selecting items. Once a quota is used, it cannot be reused. The algorithm must decide at each step \(i\) whether to use a remaining quota to select the current item.

Decision Variables

For each quota \(Q_j\) (\(j = 1, \ldots, K\)), k-potential (\(k = 1, \ldots, K\)), and step \(i\) (\(i = 1, \ldots, n\)):

$$z_{j|k}(i) = \Pr[\text{item at step } i \text{ selected using quota } Q_j \mid \text{it is a } k\text{-potential}]$$

with bounds \(0 \leq z_{j|k}(i) \leq 1\) for all \(j, k, i\).

Objective Function

Maximize the expected number of selected items among the \(K\) best:

$$\max \sum_{j=1}^K \sum_{k=1}^K \sum_{i=1}^n \frac{1}{n} \sum_{\ell=k}^K \delta_{k|\ell}(i) \cdot z_{j|k}(i)$$

where \(\delta_{k|\ell}(i)\) is the probability that the item at step \(i\) is a \(k\)-potential given it is the \(\ell\)-th best overall:

$$\delta_{k|\ell}(i) = \frac{\binom{n-i}{\ell-k} \binom{i-1}{k-1}}{\binom{n-1}{\ell-1}}$$

Constraints

1. Consistency Constraints (for \(j = 1, \ldots, K-1\)):

For each quota \(Q_j\), k-potential, and step \(i\):

$$z_{j|k}(i) \leq \sum_{m=1}^{i-1} \sum_{\ell=1}^K \frac{1}{m} \left[ z_{j+1|\ell}(m) - z_{j|\ell}(m) \right]$$

These constraints ensure that the probability of using quota \(Q_j\) at step \(i\) is consistent with having previously used quotas \(Q_{j+1}, \ldots, Q_K\) but not yet used \(Q_j\).

2. Final Quota Constraints (for \(j = K\)):

For the last quota \(Q_K\), at each step \(i\):

$$z_{K|k}(i) \leq 1 - \sum_{m=1}^{i-1} \sum_{\ell=1}^K \frac{1}{m} z_{K|\ell}(m)$$

Interpretation

The LP encodes the optimal randomized online algorithm for the K-secretary problem. Each variable \(z_{j|k}(i)\) represents the conditional selection probability at step \(i\), conditioned on the item being a \(k\)-potential (i.e., among the top \(k\) items seen so far).

Algorithm Structure

The optimal solution to the LP defines a K-threshold algorithm: at each step \(i\), the algorithm observes the current item's k-potential status and randomly selects it with probability \(z_{j|k}(i)\) using quota \(Q_j\).

Key Properties

Performance

For \(n = 30\) and \(K = 2\), the optimal LP solution achieves:

$$V_{\text{best}} \approx 1.013 \quad \Rightarrow \quad \text{CR} \approx 0.506$$

This means the algorithm selects approximately 50.6% of the top 2 items on average.

Analyzing the Class of Ordinal Algorithms

The LP solution finds a single, fully optimal algorithm. We can, however, analyze the class of ordinal algorithms by parameterizing them.

We formalize this by introducing an external agent ("nature") that has a "budget" \(B\). This budget allows the agent to choose \(B\) of the algorithm's decision variables and fix them to specific values \(\vec{c}\). This defines a subclass of algorithms, allowing us to analyze the performance spectrum.

Formalization

Let \(y \in \{0,1\}^{\text{vars}}\) be a binary vector indicating which variables nature fixes, with \(\sum_v y_v = B\). For simplicity, we consider the case where nature fixes variables to zero (\(\vec{c} = \vec{0}\)). The modified LP becomes:

$$Q(y) = \max \sum_{j,k,i} c_{j|k}(i) \cdot z_{j|k}(i)$$ $$\text{subject to: LP constraints, and } z_{j|k}(i) = 0 \text{ for all } (j,k,i) \text{ where } y_{j,k,i} = 1$$

Performance Bounds

Best-case performance:

$$V_{\text{best}}(B) = \max_{y: \sum y_v = B} Q(y) = V_{\text{best}}(0)$$

Worst-case performance:

$$V_{\text{worst}}(B) = \min_{y: \sum y_v = B} Q(y)$$

Computing \(V_{\text{worst}}(B)\) requires solving \(\binom{|\text{vars}|}{B}\) LPs (one for each choice of \(B\) variables). For small \(B\), this is feasible via enumeration. (For larger \(B\) we can later implement a dualized and combined formulation and/or decomposition techniques.)

Interference Parameter \(\alpha\)

The interference parameter \(\alpha \in [0,1]\) parameterizes the performance degradation threshold. For given budget \(B\) and interference level \(\alpha\), we identify which variable configurations \(y\) achieve performance:

$$Q(y) \geq (1-\alpha) V_{\text{best}}(B) + \alpha V_{\text{worst}}(B)$$

Interpretation:

The \(\alpha\)-Spectrum

By sweeping \(\alpha\) from 0 to 1, we obtain the \(\alpha\)-spectrum: a curve showing how performance degrades as we relax the quality threshold.

Immediate Insights: The spectrum directly reveals:

Connection to DIAL: Beyond these immediate observations, the \(\alpha\)-spectrum embodies the core principle of DIAL (Data-driven Interpretability of Algorithm Logic) introduced on the front page. By "turning the dial" from \(\alpha = 0\) (optimal) to \(\alpha = 1\) (worst-case), we systematically probe the algorithm's behavior under varying levels of interdiction. This traces the full performance spectrum and maps out a multi-dimensional response surface where performance is a direct function of the interdiction strategy. The broader purpose includes:

Interactive Demo: Vworst Explorer

Configuration

Select which variables nature can interfere with:

Select parameters and click "Run Analysis" to see results.

Interactive Demo: Interference Spectrum

Configuration

Explore the full α-spectrum showing performance degradation:

Interactive Demo: 3D Surface Analysis

Configuration

Visualize the complete performance landscape across both budget B and interference α: