Off-by-one on range boundaries
Wrong move: Loop endpoints miss first/last candidate.
Usually fails on: minimal arrays and exact-boundary answers.
Fix: Re-derive loops from inclusive/exclusive ranges before coding.
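A minimal sketch of that re-derivation (my own toy example, not from the original problem): Python's range() is half-open, so covering an inclusive interval [lo, hi] requires hi + 1.

```python
nums = [10, 20, 30, 40]
lo, hi = 0, len(nums) - 1  # inclusive interval [lo, hi]

# Off-by-one: treats hi as exclusive and misses the last candidate.
missed = [nums[i] for i in range(lo, hi)]

# Re-derived: range() is half-open, so the inclusive bound needs hi + 1.
covered = [nums[i] for i in range(lo, hi + 1)]

print(missed)   # [10, 20, 30]
print(covered)  # [10, 20, 30, 40]
```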
Build confidence with an intuition-first walkthrough of the core patterns this problem exercises.
Table: prompts
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| user_id     | int     |
| prompt      | varchar |
| tokens      | int     |
+-------------+---------+

(user_id, prompt) is the primary key (unique value) for this table.
Each row represents a prompt submitted by a user to an AI system along with the number of tokens consumed.
Write a solution to analyze AI prompt usage patterns based on the following requirements:
For each user, calculate the total number of prompts they have submitted.
For each user, calculate the average tokens used per prompt (rounded to 2 decimal places).
Only include users who have submitted at least 3 prompts.
Only include users who have submitted at least one prompt with tokens greater than their own average token usage.
Return the result table ordered by average tokens in descending order, and then by user_id in ascending order.
The result format is in the following example.
Example:
Input:
prompts table:
+---------+-------------------------+--------+
| user_id | prompt                  | tokens |
+---------+-------------------------+--------+
| 1       | Write a blog outline    | 120    |
| 1       | Generate SQL query      | 80     |
| 1       | Summarize an article    | 200    |
| 2       | Create resume bullet    | 60     |
| 2       | Improve LinkedIn bio    | 70     |
| 3       | Explain neural networks | 300    |
| 3       | Generate interview Q&A  | 250    |
| 3       | Write cover letter      | 180    |
| 3       | Optimize Python code    | 220    |
+---------+-------------------------+--------+
Output:
+---------+--------------+------------+
| user_id | prompt_count | avg_tokens |
+---------+--------------+------------+
| 3       | 4            | 237.5      |
| 1       | 3            | 133.33     |
+---------+--------------+------------+
Explanation:
Users 1 and 3 each submitted at least 3 prompts, and each has at least one prompt above their own average token usage (user 1: max 200 > avg 133.33; user 3: max 300 > avg 237.5). User 2 is excluded for having only 2 prompts. The result table is ordered by avg_tokens in descending order, then by user_id in ascending order.
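The filtering can be checked by hand; here is a small sketch (token values copied from the example input, variable names are mine):

```python
from collections import defaultdict

# (user_id, tokens) pairs from the example input.
rows = [
    (1, 120), (1, 80), (1, 200),
    (2, 60), (2, 70),
    (3, 300), (3, 250), (3, 180), (3, 220),
]

per_user = defaultdict(list)
for user_id, tokens in rows:
    per_user[user_id].append(tokens)

for user_id, tokens in sorted(per_user.items()):
    avg = sum(tokens) / len(tokens)
    qualifies = len(tokens) >= 3 and max(tokens) > avg
    print(user_id, len(tokens), round(avg, 2), qualifies)
# 1 3 133.33 True
# 2 2 65.0 False
# 3 4 237.5 True
```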
Start with the most direct exhaustive search. That gives a correctness anchor before optimizing.
Pattern signal: General problem-solving
Test fixture (JSON):
{"headers":{"prompts":["user_id","prompt","tokens"]},"rows":{"prompts":[[1,"Write a blog outline",120],[1,"Generate SQL query",80],[1,"Summarize an article",200],[2,"Create resume bullet",60],[2,"Improve LinkedIn bio",70],[3,"Explain neural networks",300],[3,"Generate interview Q&A",250],[3,"Write cover letter",180],[3,"Optimize Python code",220]]}}

Source-backed implementations are provided below for direct study and interview prep.
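One way to turn that JSON fixture into a DataFrame for local testing (a sketch; the variable names are mine, not from the original):

```python
import json

import pandas as pd

fixture = json.loads(
    '{"headers":{"prompts":["user_id","prompt","tokens"]},'
    '"rows":{"prompts":[[1,"Write a blog outline",120],[1,"Generate SQL query",80],'
    '[1,"Summarize an article",200],[2,"Create resume bullet",60],'
    '[2,"Improve LinkedIn bio",70],[3,"Explain neural networks",300],'
    '[3,"Generate interview Q&A",250],[3,"Write cover letter",180],'
    '[3,"Optimize Python code",220]]}}'
)

# Rebuild the prompts table from the fixture's headers and rows.
prompts = pd.DataFrame(
    fixture["rows"]["prompts"], columns=fixture["headers"]["prompts"]
)
print(prompts.shape)  # (9, 3)
```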
# Accepted solution for LeetCode #3793: Find Users with High Token Usage
import pandas as pd


def find_users_with_high_tokens(prompts: pd.DataFrame) -> pd.DataFrame:
    # Aggregate per user: prompt count, mean tokens, and the largest single prompt.
    df = prompts.groupby("user_id", as_index=False).agg(
        prompt_count=("user_id", "size"),
        avg_tokens=("tokens", "mean"),
        max_tokens=("tokens", "max"),
    )
    # Filter on the unrounded mean so boundary cases are judged exactly;
    # round only afterwards, for display.
    df = df[(df["prompt_count"] >= 3) & (df["max_tokens"] > df["avg_tokens"])].copy()
    df["avg_tokens"] = df["avg_tokens"].round(2)
    df = (
        df.sort_values(["avg_tokens", "user_id"], ascending=[False, True])
        .loc[:, ["user_id", "prompt_count", "avg_tokens"]]
        .reset_index(drop=True)
    )
    return df
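As a cross-check, here is a sketch of the same logic built around groupby().transform, which flags above-average prompts row by row before aggregating. The data and names are mine; the prompt text is omitted since only user_id and tokens matter for the result.

```python
import pandas as pd

prompts = pd.DataFrame(
    {
        "user_id": [1, 1, 1, 2, 2, 3, 3, 3, 3],
        "tokens": [120, 80, 200, 60, 70, 300, 250, 180, 220],
    }
)

# Flag each prompt that exceeds its user's (unrounded) average token usage.
user_avg = prompts.groupby("user_id")["tokens"].transform("mean")
flagged = prompts.assign(above_avg=prompts["tokens"] > user_avg)

result = flagged.groupby("user_id", as_index=False).agg(
    prompt_count=("tokens", "size"),
    avg_tokens=("tokens", "mean"),
    has_high=("above_avg", "any"),
)
result = result[(result["prompt_count"] >= 3) & result["has_high"]].copy()
result["avg_tokens"] = result["avg_tokens"].round(2)
result = (
    result.sort_values(["avg_tokens", "user_id"], ascending=[False, True])
    .loc[:, ["user_id", "prompt_count", "avg_tokens"]]
    .reset_index(drop=True)
)
print(result)
# expected: user 3 (4 prompts, 237.5) first, then user 1 (3 prompts, 133.33)
```

The transform step is a common alternative when a per-row comparison against a group statistic is needed before the final aggregation.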
Use this problem to rehearse a reusable interview workflow.
Two nested loops check every pair or subarray. The outer loop fixes a starting point, the inner loop extends or searches. For n elements this gives up to n²/2 operations. No extra space, but the quadratic time is prohibitive for large inputs.
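As a concrete instance of that shape (a classic toy example, not part of this SQL problem): a brute-force maximum-subarray scan where the outer loop fixes the start and the inner loop extends the end.

```python
def max_subarray_brute(nums: list[int]) -> int:
    # Try every subarray: fix a start, extend the end, track the best sum.
    best = nums[0]
    for i in range(len(nums)):
        total = 0
        for j in range(i, len(nums)):
            total += nums[j]
            best = max(best, total)
    return best

print(max_subarray_brute([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```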
Most array problems have an O(n²) brute force (nested loops) and an O(n) optimal (single pass with clever state tracking). The key is identifying what information to maintain as you scan: a running max, a prefix sum, a hash map of seen values, or two pointers.
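The same toy problem with the "clever state" version: a single pass that maintains the best sum of a subarray ending at the current index (Kadane's algorithm), turning the quadratic scan into O(n).

```python
def max_subarray(nums: list[int]) -> int:
    # cur: best sum of a subarray ending exactly at the current element.
    best = cur = nums[0]
    for x in nums[1:]:
        cur = max(x, cur + x)  # extend the running subarray or restart at x
        best = max(best, cur)
    return best

print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
```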
Review these pitfalls before coding to avoid predictable interview regressions:
Wrong move: Loop endpoints miss the first or last candidate.
Usually fails on: minimal arrays and exact-boundary answers.
Fix: Re-derive loop bounds from the inclusive/exclusive ranges before coding.