Coding with Gemini: A Personalized News Article Service


Have you ever wanted your own personal news feed that lets you (1) browse a large, near-real-time stream of articles from many news sources and (2) view or filter those articles by central topic, keywords, perspective, or publication date and time? That is my objective, and I am using Gemini, Google’s multimodal large language model (LLM), to build such an application.

Objective: Build a news article service that…

1. Periodically (e.g., every 2 hours) gathers current news articles from a wide variety of news sources.

2. Categorizes each article (e.g., Politics, Business, Education, Technology/Science, Government, Health, Entertainment, Legal, Community-Oriented, Real Estate, Military/Warfare).

3. Extracts some keywords from the article’s content.

4. Stores the articles in an ongoing repository where they can be further analyzed and documented, including for various types of similarity between articles.

5. Provides an interface to the repository that supports a wide range of ways to browse and filter the stored articles.

Staged Approach/Plan

1. Set up the News Article Repository described in Objective #4 above. Each article in the repository has several keys, one of which holds the timestamp of when its batch of articles was pulled via NewsApi.org. Initially, the repository is empty.

2. Use an AI bot (Google Gemini) to generate an executable Python script that performs Objectives #1 – #3 above. This has been accomplished via the following request to Google Gemini.

Link to the Google Gemini response to this request, including: (1) technical approach, (2) libraries/modules used, (3) code, and (4) suggested enhancements
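To make Objectives #1 – #3 concrete, here is a minimal sketch of what such a script might look like. It is not the Gemini-generated code. It assumes NewsAPI's v2 `top-headlines` endpoint with an API key passed as a query parameter, and it substitutes simple word counting for whatever keyword and categorization technique the real script uses. The stop-word list, the category vocabulary, and all function names are hypothetical.

```python
import json
import re
import urllib.parse
import urllib.request
from collections import Counter

# Assumption: NewsAPI's v2 "top-headlines" endpoint; the API key is supplied by the caller.
NEWSAPI_URL = "https://newsapi.org/v2/top-headlines"

# Hypothetical, deliberately tiny stop-word list and category vocabulary.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on", "for",
              "is", "are", "was", "were", "with", "by", "at", "as", "that",
              "this", "it", "its", "from", "be", "has", "have"}
CATEGORY_TERMS = {
    "Politics": {"election", "senate", "congress", "campaign", "vote"},
    "Business": {"market", "stocks", "earnings", "company", "merger"},
    "Health": {"hospital", "vaccine", "disease", "patients", "health"},
    "Technology/Science": {"software", "ai", "research", "chip", "nasa"},
}


def fetch_articles(api_key, country="us", page_size=50):
    """Pull one batch of headlines and return the list under the 'articles' key."""
    query = urllib.parse.urlencode(
        {"country": country, "pageSize": page_size, "apiKey": api_key})
    with urllib.request.urlopen(f"{NEWSAPI_URL}?{query}", timeout=30) as resp:
        return json.load(resp).get("articles", [])


def extract_keywords(text, top_n=5):
    """Return the most frequent words that are not stop words and are longer than 2 letters."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


def categorize(text):
    """Pick the category whose vocabulary overlaps the text most, if any overlaps at all."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {cat: len(words & terms) for cat, terms in CATEGORY_TERMS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Uncategorized"
```

Running `fetch_articles` every two hours (for example from cron or a scheduler) and passing each article's text through `extract_keywords` and `categorize` would cover the three objectives in their simplest form. A hand-written vocabulary like this is brittle, so a real service would likely need a trained classifier or an LLM call for categorization.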

3. Adjust and tune the Python script to fix errors and to store the new articles (i.e., those not already in the repository) from each pulled batch.

a. Note 3.1: Articles in the repository are kept unique by URL; no two stored articles share the same URL.

b. Note 3.2: This work is mostly done but still in progress.
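The repository from Step 1 and the URL-based uniqueness from Note 3.1 could be sketched as follows. This is an illustration only: the post does not say which storage technology the repository uses, so SQLite is an assumption here, as are the table layout and the function names.

```python
import sqlite3
from datetime import datetime, timezone


def open_repository(path=":memory:"):
    """Open (or create) the article repository. SQLite is an assumed choice."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS articles (
            url       TEXT PRIMARY KEY,   -- uniqueness is enforced on the URL
            title     TEXT,
            content   TEXT,
            category  TEXT,
            keywords  TEXT,               -- comma-separated for simplicity
            pulled_at TEXT NOT NULL       -- timestamp of the batch pull
        )""")
    return conn


def store_batch(conn, articles, pulled_at=None):
    """Insert only articles whose URL is not yet stored; return how many were added."""
    pulled_at = pulled_at or datetime.now(timezone.utc).isoformat()
    added = 0
    with conn:  # commit the whole batch as one transaction
        for art in articles:
            cur = conn.execute(
                "INSERT OR IGNORE INTO articles VALUES (?, ?, ?, ?, ?, ?)",
                (art["url"], art.get("title"), art.get("content"),
                 art.get("category"), ",".join(art.get("keywords", [])),
                 pulled_at))
            added += cur.rowcount  # 1 if inserted, 0 if the URL already existed
    return added
```

Because the URL is the primary key, pulling overlapping batches every two hours is harmless: `INSERT OR IGNORE` silently skips anything already stored, and every article in a batch shares the same `pulled_at` timestamp.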

4. Use an AI bot (Google Gemini) to generate a Python script that compares pairs of texts (article content) and scores how similar their keywords are.

This has been mostly accomplished…

a. Note 4.1: This is quite straightforward computationally.

b. Note 4.2: Keyword similarity is not synonymous with true content similarity…stay tuned…

Link to the Google Gemini response to this request, including: (1) technical approach, (2) libraries/modules used, (3) code, and (4) suggested enhancements
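One simple way to score keyword similarity is the Jaccard index: the number of shared keywords divided by the number of distinct keywords across both texts. The post does not say which measure the generated script uses, so treat this as an assumed stand-in; the function name is hypothetical.

```python
def jaccard_similarity(keywords_a, keywords_b):
    """Shared keywords divided by all distinct keywords; 0.0 when both sets are empty."""
    a, b = set(keywords_a), set(keywords_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two keyword lists with two of four distinct keywords in common.
print(jaccard_similarity(["senate", "vote", "bill"], ["vote", "bill", "court"]))  # 0.5

# Same story, different vocabulary: the score is zero.
print(jaccard_similarity(["car", "crash", "driver"],
                         ["automobile", "collision", "motorist"]))  # 0.0
```

The second call illustrates Note 4.2: two articles can describe the same event and still share no keywords at all, which is why keyword overlap alone cannot stand in for similarity of meaning.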

5. Use an AI bot (Google Gemini) to generate a Python script that compares pairs of texts (article content) and scores their semantic similarity (i.e., similarity of meaning).

a. Note 5.1: This is much more complex and may be only partially attainable.

b. Note 5.2: It will employ several AI techniques including sentence-transformers…

c. Note 5.3: Fixing and tuning the generated script is a work in progress…

Link to the Google Gemini response to this request, including: (1) technical approach, (2) libraries/modules used, (3) code, and (4) suggested enhancements
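As a rough illustration of the sentence-transformers approach mentioned in Note 5.2, the sketch below embeds two texts and compares the embeddings by cosine similarity. It is not the generated script: the model name `all-MiniLM-L6-v2` is an assumed choice, the function names are hypothetical, and the `sentence-transformers` package must be installed for `semantic_similarity` to run.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def semantic_similarity(text_a, text_b, model_name="all-MiniLM-L6-v2"):
    """Embed both texts and compare them. The model name is an assumption."""
    # Imported lazily so cosine_similarity stays usable without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    emb_a, emb_b = model.encode([text_a, text_b])  # one embedding vector per text
    return cosine_similarity(emb_a.tolist(), emb_b.tolist())
```

Unlike keyword overlap, embeddings can place paraphrases such as "car crash injures driver" and "automobile collision hurts motorist" close together. In a real pipeline the model would be loaded once and reused across many article pairs rather than reloaded on every call, and long articles may need to be split or truncated, since these models only read a limited number of tokens.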

Observations, Lessons Learned, Going Forward: I have pondered and experimented with possible realizations of this application off and on for quite a while. I never doubted that news articles and diverse news sources are readily available. But most news services are narrow, inflexible, and polarized in terms of perspective.

And truly understanding news article content is anything but simple for a machine, although humans do it very well. AI has recently advanced, at least somewhat, in this regard, which encourages me to experiment further, including assessing meaning, perspective, similarity, etc. Hence this effort. I very much welcome comments, suggestions, collaborations…
