Taro Logo

Article Views I

Easy
Meta logo
Meta
Topics:
Database Problems

You are given a table named Views that records which viewers have seen articles written by specific authors on certain dates. The table has the following schema:

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| article_id    | int     |
| author_id     | int     |
| viewer_id     | int     |
| view_date     | date    |
+---------------+---------+
  • article_id is the ID of the article.
  • author_id is the ID of the author who wrote the article.
  • viewer_id is the ID of the viewer who viewed the article.
  • view_date is the date when the article was viewed.

Note that there is no primary key, and the table may contain duplicate rows. Also, equal author_id and viewer_id indicate the same person.

Write an SQL query to find all authors who have viewed at least one of their own articles. The result table should contain a single column named id, representing the author IDs. Return the result table sorted in ascending order by id.

For example, consider the following Views table:

+------------+-----------+-----------+------------+
| article_id | author_id | viewer_id | view_date  |
+------------+-----------+-----------+------------+
| 1          | 3         | 5         | 2019-08-01 |
| 1          | 3         | 6         | 2019-08-02 |
| 2          | 7         | 7         | 2019-08-01 |
| 2          | 7         | 6         | 2019-08-02 |
| 4          | 7         | 1         | 2019-07-22 |
| 3          | 4         | 4         | 2019-07-21 |
| 3          | 4         | 4         | 2019-07-21 |
+------------+-----------+-----------+------------+

The expected output is:

+------+
| id   |
+------+
| 4    |
| 7    |
+------+

In this example, authors with IDs 4 and 7 have viewed their own articles.

Solution


Naive Solution

The most straightforward approach is to iterate through the Views table and check for each row if the author_id is equal to the viewer_id. If they are equal, it means the author viewed their own article. We collect these author IDs and then return them in ascending order, ensuring we only list each author once.

SQL Query

SELECT DISTINCT author_id AS id
FROM Views
WHERE author_id = viewer_id
ORDER BY id;

Big O Analysis

  • Time Complexity: O(N * log(N)) - The DISTINCT operation and ORDER BY might require sorting, giving the N * log(N) complexity. The initial scan is O(N).
  • Space Complexity: O(N) - In the worst-case scenario, where every row satisfies the condition author_id = viewer_id, we would store each distinct author ID.

Optimal Solution

The naive solution is, in most cases, already quite optimal for this problem, given the problem constraints. However, we can ensure efficiency by directly targeting the relevant rows where author_id equals viewer_id.

SQL Query

SELECT DISTINCT author_id AS id
FROM Views
WHERE author_id = viewer_id
ORDER BY id;

Big O Analysis

  • Time Complexity: O(N * log(N)) - Same as the naive solution; the DISTINCT and ORDER BY operations dominate the complexity.
  • Space Complexity: O(N) - Same as the naive solution; proportional to the number of distinct authors viewing their articles.

Edge Cases

  • Empty Table: If the Views table is empty, the query will return an empty result set, which is the correct behavior.
  • No Matching IDs: If no rows satisfy the condition author_id = viewer_id, the query will return an empty result set.
  • Duplicate Rows: The DISTINCT keyword handles duplicate rows effectively, ensuring each author is only listed once.
  • Large Table: For extremely large tables, ensure that appropriate indexes are in place to speed up the query. However, given the problem's purpose and scale, indexing considerations are generally not necessary for interview purposes.