Published on 15.11.2025

According to the international research company 360iResearch, the global big data market is expected to reach USD 284.91 billion by 2025. Big data is becoming a key driver of the global economy: it enables analysts to work with huge volumes of information using a variety of methods and tools. The main goal is to extract valuable insights, identify patterns and trends, and support well-informed decision-making. Data processing helps assess project viability, reduce implementation risks, and even prevent cybercrime, says Iaroslav Argunov, a data analyst with a PhD in engineering sciences. His proprietary method for high-level cost estimation and forecasting makes it possible to plan budgets and timelines for large construction projects at an early stage, while an AI-based anti-fraud system he worked on protects website users from phishing ads. We spoke with the developer about how his methodology helps in urban renovation and about the role of artificial intelligence in data processing.

— Iaroslav, why is big data analysis so critical?
— It helps reveal hidden patterns and trends, which in turn makes it possible to solve complex problems, forecast the future, and optimize processes. Data analysis enables the identification of connections between events and the understanding of human behavior. I believe it is a powerful tool for research, discovery, and innovation that makes our world a better place.

— You have worked with big data in many different fields, which led you to develop a high-level estimation and forecasting methodology that was featured in the journal Industrial and Civil Construction. What is its essence, briefly?
— The idea is to quickly estimate how much money and time a large construction project will require, without detailed drawings or budgets. First, we define the scope: the amount of housing, retail space, parking, schools, hospitals, roads, and infrastructure that need to be built. Then we apply generalized coefficients — like ready-made templates for prices, inflation, risks, and project sequencing. Everything is distributed by year. Revenues are calculated based on a realistic sales plan for apartments or commercial premises, taking into account prevailing market prices. As a result, we get an annual forecast: how much will be spent, how much will be earned, profit or loss, and approximate timelines. Later, when drawings and detailed budgets appear, the framework is refined.
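The logic of such an early-stage estimate can be illustrated with a minimal sketch in Python. It is not Argunov's actual model: the unit-cost coefficients, sale prices, inflation rate, and risk markup below are illustrative assumptions, and a real calculation would draw on benchmark databases and a far more detailed scope.

```python
from dataclasses import dataclass

# Illustrative unit costs and sale prices (RUB per m2); assumed values, not real benchmarks.
UNIT_COST = {"housing": 90_000, "retail": 70_000, "parking": 45_000}
SALE_PRICE = {"housing": 180_000, "retail": 150_000}

@dataclass
class Scope:
    area_m2: dict       # planned areas by asset type
    build_share: list   # share of construction spend per year, sums to 1.0
    sales_share: list   # share of sellable area sold per year, sums to 1.0

def annual_forecast(scope: Scope, inflation: float = 0.07, risk_markup: float = 0.10):
    """Rough yearly cash-flow forecast: costs from generalized coefficients,
    revenues from a phased sales plan, both indexed for inflation."""
    base_cost = sum(scope.area_m2[k] * UNIT_COST[k] for k in scope.area_m2 if k in UNIT_COST)
    base_cost *= (1 + risk_markup)  # contingency for risks
    base_revenue = sum(scope.area_m2[k] * SALE_PRICE[k] for k in scope.area_m2 if k in SALE_PRICE)
    rows = []
    for year, (c_share, s_share) in enumerate(zip(scope.build_share, scope.sales_share)):
        index = (1 + inflation) ** year  # simple price escalation
        cost = base_cost * c_share * index
        revenue = base_revenue * s_share * index
        rows.append({"year": year + 1, "cost": cost, "revenue": revenue, "net": revenue - cost})
    return rows

# Example: a three-year phased project
plan = Scope(area_m2={"housing": 50_000, "retail": 5_000, "parking": 10_000},
             build_share=[0.5, 0.3, 0.2], sales_share=[0.0, 0.4, 0.6])
for row in annual_forecast(plan):
    print(row)
```

The point of the sketch is the structure, not the numbers: quantities times generalized coefficients give costs, a phased sales plan gives revenues, and spreading both across years yields the annual profit-or-loss picture described above.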

— What practical effect does this methodology bring?
— In urban projects, such a framework can be built within hours or days instead of weeks. It allows for comparing options “on paper” — for example, choosing between monolithic and panel construction, or changing the order of relocation phases — and immediately seeing where there is a profit or loss. In large regional projects, such as renovations, the 10–11-year model shows whether everything will pay off under real inflation and phased commissioning. In one of our cases, the total result was several billion rubles in profit, which meant we could proceed with detailed budgeting and financing options. I successfully applied this methodology in government projects at the Moscow Analytical Center, where the “data + macro-model” approach reduced manual procedures roughly threefold and freed up analysts’ time for analysis instead of copy-paste work. The methodology also proved effective at the investment company Alesium in Cyprus: the path from idea to board memo was shortened from several weeks to just a few days — and sometimes hours — and the company concluded several profitable deals.

— You have devoted several public talks to early-stage project evaluation, including at RANEPA’s data analytics events. What did you focus on?
— Participants were interested in how early-stage evaluation of large projects is done worldwide — from high-level methods to data integration. I explained why this combination accelerates management decisions and reduces the risk of budget miscalculations. Another talk focused on regional renovation, using the Yaroslavl case as an example.

— You have also worked on data annotation projects for artificial intelligence. Tell us more about them.
— Data annotation is an engineering product with rules, quality control, and metrics — a kind of textbook for a neural network. It contains examples and correct answers that help the model learn to distinguish typical cases from anomalies. Neural networks perform only as well as the “textbook” you create for them. For Yandex, for instance, I optimized search moderation and anti-fraud. Through improved annotation and pipeline redesign, we achieved an 11% increase in model accuracy, an 11-fold acceleration of the production pipeline, and approximately a 37% reduction in cost. Response time to phishing ads was cut by a factor of 1.5. In parallel, I worked on projects for major retailers and e-commerce companies that also face issues related to content substitution and fraud.
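One standard way to put numbers on the quality of such a “textbook” is to measure agreement between annotators; the snippet below is a generic illustration using Cohen's kappa, not a description of the specific metrics used at Yandex.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Agreement between two annotators on the same items, corrected for chance.
    Values near 1.0 indicate consistent labeling; low values signal unclear guidelines."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(labels_a) | set(labels_b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical example: two annotators labeling the same six ads
a = ["ok", "ok", "fraud", "ok", "fraud", "ok"]
b = ["ok", "fraud", "fraud", "ok", "fraud", "ok"]
print(round(cohens_kappa(a, b), 2))
```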

— As data volumes grow, so do security threats. You set up a system to combat cyber fraud and moderate phishing content in search, helping protect millions of Yandex users. What exactly does the bot do?
— My moderation bot’s task is to detect content substitution on landing pages reached from ad results, as well as fraud schemes. This often occurs when dishonest advertisers create landing pages that closely resemble official company or organization websites, using similar logos, colors, and designs. They may also post fake reviews and ratings that mislead users. The bot checks the consistency between ad creatives and landing pages, flags suspicious content, and sends it for manual review or blocking. As a result, we achieved not only a systematic reduction in hidden fraud — such as fake services or non-existent goods — but also a significant reduction in the manual moderation workload.
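The core idea of such a consistency check can be sketched in a few lines. The sketch below is a simplified illustration, not the production system: it uses a plain word-overlap score and an assumed threshold, whereas a real pipeline would rely on trained models, visual features, and many more signals.

```python
import re

def tokens(text: str) -> set:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-zа-я0-9]+", text.lower()))

def consistency_score(ad_text: str, landing_text: str) -> float:
    """Jaccard overlap between the ad creative and the landing page text.
    A low score suggests the page does not match what was advertised."""
    a, b = tokens(ad_text), tokens(landing_text)
    return len(a & b) / len(a | b) if a | b else 0.0

def flag_for_review(ad_text: str, landing_text: str, threshold: float = 0.2) -> bool:
    """Route suspicious ad/landing-page pairs to manual moderation."""
    return consistency_score(ad_text, landing_text) < threshold

ad = "Official bank deposit, 12% annual rate"
page = "Enter your card number and SMS code to claim your prize"
if flag_for_review(ad, page):
    print("Suspicious: send to manual review")
```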

— In your opinion, what specific tasks in data analysis will AI be able to perform in the near future?
— Artificial intelligence will be actively used to automate big data analysis, including image recognition, text processing, voice analysis, and more. It can also be used to forecast trends and identify patterns and anomalies in data, enabling companies to make more informed decisions.
