Hermes Technology
1 Overview
1.1 Domain-Specific Models
In the current technological and societal context, the development of large models has evolved in two main directions:
Evolution of general-purpose models: For example, the progression from GPT-4 to GPT-5, where GPT-5 is claimed to have learned from all the videos on the internet. Evolving these general-purpose models requires substantial computational power and resource investment.
Construction of domain-specific models: A large amount of deep-web data and private-domain data lies beyond the reach of general-purpose models, and the pace at which those models evolve cannot keep up with the speed of cognitive improvement that many specific fields need to support productivity growth. Building vertical-domain models with sufficient depth of knowledge in specialized areas is therefore a critical direction. Examples include BloombergGPT and MathGPT.
What can domain-specific models achieve?
Domain-specific models build a deep understanding of a particular area on top of a general model, enabling them to better receive, process, summarize, and analyze domain information and to use domain tools to complete domain tasks.
For example, a layoff of 10,000 people is treated as a negative event by general-purpose models, but BloombergGPT (a financial model) may take a different view, sometimes reading it as positive because it can boost a company's stock price or investor confidence.
Additionally, in many vertical-domain scenarios much of a large model's capacity is redundant; writing blockchain code, for instance, likely does not require advanced essay-writing skill. In domain-specific production settings, the more important goals are therefore to deepen and sharpen domain knowledge while selectively discarding other capabilities, which improves performance and reduces cost. This trade-off is essential for genuinely improving productivity.
1.2 Utilizing Public Sentiment Data
Large models have significant advantages in analyzing public sentiment data. By modeling and fitting the latent variables and relationships within the data, they can deeply mine and reveal the patterns and trends behind public sentiment. In sentiment analysis and emotion recognition in particular, precise modeling of emotional features lets large models effectively capture and understand the emotional tendencies and fluctuations in public sentiment, providing support for decision-making.
Additionally, by integrating data-mining techniques such as cluster analysis, large models can analyze public sentiment data comprehensively across multiple dimensions, producing in-depth analytical results and uncovering latent patterns in the data. This combination of techniques enables large models to excel at handling complex public sentiment data and to provide users with comprehensive, accurate data support.
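As an illustration of the cluster-analysis step mentioned above, the sketch below groups a few hypothetical sentiment posts using TF-IDF features and k-means; the sample texts, cluster count, and choice of scikit-learn are assumptions made for the example, not a description of Hermes' actual pipeline.

```python
# Minimal sketch: clustering public-sentiment texts to surface latent topics.
# Sample posts and cluster count are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

posts = [
    "BTC breaks above resistance, bullish momentum building",
    "Exchange outage sparks panic selling across altcoins",
    "New DeFi protocol audit reveals critical vulnerability",
    "ETH staking yields attract institutional inflows",
    "Rug pull suspected after liquidity suddenly removed",
]

# Turn raw posts into sparse TF-IDF vectors.
vectors = TfidfVectorizer(stop_words="english").fit_transform(posts)

# Group posts into a small number of topic/sentiment clusters.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors)

for post, label in zip(posts, kmeans.labels_):
    print(label, post)
```

In a production setting the TF-IDF features would typically be replaced by richer sentiment or embedding features, but the block-level idea of grouping related content before counting and trend analysis is the same.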
2 Training Architecture

2.1 Foundation Model
NLP-LLM (130B): A large foundation model with 130 billion parameters, responsible for processing text data. This undoubtedly costs more than the 6B/13B models that have recently become popular on the market. We do not blindly pursue large parameter counts, but we do believe in the emergent capabilities that large models exhibit when supported by sufficient data.
CV-LLM: A large model for computer vision, handling image and video data.
2.2 Model Capability Outputs
Hermes encompasses domains such as Natural Language Processing (NLP) and Computer Vision (CV) and includes four core sub-models: the general model, the structural cognition model, the demand cognition model, and the task process model (a simple dispatch sketch follows the list below).
General Model: This model includes NLP capabilities such as conversation, creation, and question-answering, as well as CV capabilities like recognition, understanding, and generation, with cross-modal capabilities. Hermes is trained on vast amounts of text, images, audio, and video data using self-supervised learning and transfer learning methods, enabling it to possess multimodal understanding and generation capabilities.
Structural Cognition Model: This covers expert knowledge in fields such as blockchain, smart contracts, cryptocurrencies, decentralized applications (Dapps), decentralized finance (DeFi), and decentralized autonomous organizations (DAOs). By incorporating domain-specific knowledge graphs and professional datasets, along with semi-supervised learning and fine-tuning methods, Hermes can deeply understand and apply knowledge from these domains.
Demand Cognition Model: Focused on the needs of project development and operations, data analysis and prediction, privacy and security. Hermes utilizes demand-specific datasets and task-driven training approaches, employing reinforcement learning and multi-task learning methods, allowing the model to cater to professional applications in various demand scenarios.
Task Process Model: Supports tasks such as blockchain transactions, market monitoring, trade strategy generation, and backtesting. Through simulating real-world task scenarios and using deep reinforcement learning methods, Hermes efficiently completes complex task processes, providing automated and intelligent solutions.
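As a rough illustration (not Hermes' internal API), the four sub-models above can be pictured as a simple dispatch table that routes each request type to the sub-model responsible for it; the handler names and request categories below are hypothetical.

```python
# Illustrative dispatch sketch: mapping request types to the four sub-models.
# Handler names and request categories are hypothetical stand-ins.
def general_model(req):        return f"general answer to: {req}"
def structural_cognition(req): return f"domain-knowledge answer to: {req}"
def demand_cognition(req):     return f"scenario-specific answer to: {req}"
def task_process(req):         return f"task execution plan for: {req}"

ROUTES = {
    "chat": general_model,             # conversation, creation, Q&A
    "domain_qa": structural_cognition, # blockchain / DeFi / DAO knowledge
    "analytics": demand_cognition,     # data analysis, privacy, security needs
    "execution": task_process,         # trading, monitoring, backtesting tasks
}

print(ROUTES["domain_qa"]("How does a DAO treasury vote work?"))
```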
2.3 Data Sets
a) General Data
This includes the accumulation of open-source datasets utilized by the Insight team in past NLP and large model research, as well as supplemental datasets specific to the cryptocurrency sector and scenarios. These include:
Text: News articles, research papers, social media content, etc.
Images: Market trend graphs, technical analysis charts, etc.
Audio: Market commentary, expert interviews, etc.
Video: Market analysis videos, educational videos, etc.
Code: Smart contract code, trading algorithms, etc.
etc.
b) Domain Data
This includes existing data accumulated by the Insight team across various projects and new data acquired through original data-acquisition channels, primarily cryptocurrency project data, on-chain data, and public sentiment data. It also includes data obtained through collaborations with professional organizations, such as industry research data (for example, Bloomberg's 40 years of accumulated professional research reports, which form part of the core data behind BloombergGPT) and security data (which is typically hard to obtain from the surface web and is a typical example of domain-specific data).
Project Data: Project whitepapers, team backgrounds, technical documentation, etc.
On-Chain Data: Transaction data, wallet addresses, miner data, etc.
Crypto Sentiment: Social media discussions, news reports, forum discussions, etc.
Industry Reports: In-depth analysis reports from research institutions.
Security Data: Vulnerability reports, security recommendations related to crypto security, etc.
etc.
We are well aware that the web3 industry is characterized by a data explosion, and we welcome companies from various sub-sectors within the industry to join our domain data collaboration, contributing to the co-creation of the model.
2.4 Training Processes

a) Pre-training
During the pre-training phase, Hermes uses self-supervised learning to perform initial, general-purpose training on a vast amount of data. The specific steps are as follows:
Data Collection and Cleaning: Collect a large volume of text, images, audio, video, code, and other data from the internet, open-source datasets, industry reports, etc. Use data cleaning techniques to remove noise and irrelevant information, ensuring data quality.
Data Preprocessing: Preprocess different types of data, such as text tokenization, image normalization, and audio feature extraction. Insight supports a lakehouse architecture and an AI platform to preprocess multi-source heterogeneous data.
Self-Supervised Learning: Train the model to learn the inherent structure and features of the data through self-supervised learning methods. For example, use BERT's Masked Language Model (MLM) and GPT's Autoregressive (AR) model for text pre-training, and use contrastive learning methods for image and video pre-training.
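To make the masked-language-model objective mentioned above concrete, here is a minimal sketch that masks a fraction of token ids and keeps the originals as labels; the mask token id, masking rate, and sample ids are assumptions, not Hermes' actual pre-training code.

```python
import random

MASK_ID = 103       # hypothetical [MASK] token id
MASK_PROB = 0.15    # standard BERT-style masking rate (assumed here)

def mask_tokens(token_ids, mask_prob=MASK_PROB, mask_id=MASK_ID):
    """Return (masked inputs, labels) for one masked-language-model step.

    Labels are -100 at unmasked positions so a cross-entropy loss with
    ignore_index=-100 would only score the masked tokens.
    """
    inputs, labels = [], []
    for tok in token_ids:
        if random.random() < mask_prob:
            inputs.append(mask_id)   # hide the token from the model
            labels.append(tok)       # the model must reconstruct the original
        else:
            inputs.append(tok)
            labels.append(-100)      # ignored by the loss
    return inputs, labels

random.seed(0)
print(mask_tokens([2023, 3006, 2003, 1037, 17371, 2944]))
```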
b) Fine-tuning
Building on the pre-trained model, Hermes undergoes further specialized training through semi-supervised fine-tuning. The specific steps are as follows:
Data Augmentation: Use data augmentation techniques such as random cropping, rotation, and flipping to increase the diversity of training data and improve the model's generalization ability.
Specialized Data Integration: Integrate industry-specific datasets, such as cryptocurrency market data, on-chain transaction data, and smart contract code libraries. Use domain knowledge graphs to enhance the model's knowledge structure.
Semi-Supervised Learning: Train the model by combining labeled and unlabeled data. Use Generative Adversarial Networks (GANs) to generate high-quality pseudo-label data, improving the model's learning effectiveness.
Hyperparameter Optimization: Adjust the model's hyperparameters using methods like Grid Search and Bayesian Optimization to find the optimal configuration.
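A minimal sketch of the grid-search step described above, using a stand-in scikit-learn classifier on synthetic data so the example runs end to end; the estimator, parameter grid, and data are assumptions rather than Hermes' actual fine-tuning configuration.

```python
# Grid-search sketch with a stand-in classifier on synthetic data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

param_grid = {
    "C": [0.01, 0.1, 1.0, 10.0],   # regularization strength candidates
    "penalty": ["l2"],             # kept small for illustration
}

search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X, y)

print("best params:", search.best_params_)
print("best CV score:", round(search.best_score_, 3))
```

Bayesian optimization would follow the same pattern but replace the exhaustive grid with a surrogate-model-guided search over the same hyperparameter space.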
3 Specialized Training
During Hermes' training process, the core capabilities targeting user and product needs underwent independent specialized training. These include: sentiment analysis, trading systems, crypto privacy, and security.
3.1 Sentiment Analysis

Based on the characteristics of public sentiment content in the cryptocurrency field, Hermes has designed and trained unique processes for sentiment analysis. This forms the core technical chain for generating a crypto sentiment analysis indicator system. The process includes the following features:
Precise Cryptocurrency Identification: Achieving precise cryptocurrency identification based on a knowledge graph of cryptocurrency entity relationships, moving beyond simple text matching.
Multi-dimensional Sentiment Analysis: Not limited to simple sentiment polarity classification (such as positive, negative, neutral), but also includes refined analysis of emotion intensity and sentiment inclination.
Detailed Account Influence Evaluation: Evaluating the influence of various social media accounts by analyzing their follower count, interaction frequency, and historical posting effectiveness to determine which accounts have significant influence in the cryptocurrency market, identifying key opinion leaders (KOLs) and potential market drivers.
Precise Content Dissemination Analysis: Analyzing the dissemination path and effectiveness of specific content across different platforms. Through content dissemination analysis, identifying which information has gained widespread attention in the market and evaluating its impact on market sentiment and price volatility.
Key Event Extraction: Summarizing each trending event from the text content itself rather than relying on hashtags ("#") as traditional approaches do.
During the analysis of public sentiment content, Hermes introduces a unique Topic Block model. In the social sentiment network, user content does not exist in isolation. By intelligently assembling topic blocks from the horizontal and vertical content linking users, and then analyzing, counting, and predicting at the level of these blocks, Hermes obtains more reasonable and accurate results. Examples of topic blocks include contextual conversations among users in a community, or a tweet together with the tweets that interact with it and its related comments.
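A minimal sketch of how topic blocks might be assembled from a reply graph: each root post and everything that replies to it, directly or transitively, is grouped into one block. The post structure and field names are illustrative assumptions; Hermes' actual Topic Block construction is not specified here.

```python
from collections import defaultdict

# Hypothetical posts: root tweets, replies, and replies to replies.
posts = [
    {"id": 1, "parent": None, "text": "Project X mainnet launch announced"},
    {"id": 2, "parent": 1,    "text": "Huge if true, token could rally"},
    {"id": 3, "parent": 2,    "text": "Team has delayed twice before..."},
    {"id": 4, "parent": None, "text": "Exchange Y lists new perp pairs"},
    {"id": 5, "parent": 4,    "text": "Funding rates already spiking"},
]

def root_of(post_id, parent_map):
    """Walk parent links up to the root post of a conversation."""
    while parent_map[post_id] is not None:
        post_id = parent_map[post_id]
    return post_id

parent_map = {p["id"]: p["parent"] for p in posts}

# Group every post under the root of its conversation thread.
topic_blocks = defaultdict(list)
for p in posts:
    topic_blocks[root_of(p["id"], parent_map)].append(p["text"])

for root, texts in topic_blocks.items():
    print(f"Topic block rooted at post {root}: {len(texts)} posts")
```

Analyzing sentiment at the block level, rather than post by post, keeps a reply's meaning attached to the conversation that produced it.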
3.2 Trading System
Hermes bases its learning on data from traders on major exchanges and high-quality on-chain trader wallets, combined with the Insight market indicator system. This enables professional, distinctive learning of trading systems, covering trading-goal setting, risk-management planning, position management and control, market analysis methods, and trading discipline.
Hermes supports the custom generation and backtesting of multi-dimensional integrated strategies, built from the following trigger types (a sketch of how these triggers might be expressed in code follows the list):
Indicator Triggers: Triggers based on indicators in the indicator system reaching set thresholds, for example, Volume of Mention exceeding n.
Comparison Triggers: Triggers based on one indicator in the indicator system being compared against another, for example, a project's Volume of Mention surpassing BTC's Volume of Mention.
Event Triggers: Triggers based on specific events in the sentiment analysis system, for example, a tweet by CZ mentioning a specific project.
Pattern Triggers: Triggers based on technical patterns matching defined shapes, for example, a project's daily K-line (candlestick) chart showing a bearish divergence.
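The trigger types listed above could, for instance, be expressed as simple predicate objects evaluated against a snapshot of indicator, event, and pattern data; the field names, thresholds, and snapshot structure below are illustrative assumptions, not the Hermes strategy API.

```python
from dataclasses import dataclass

# Hypothetical snapshot of indicator values, events, and patterns for one asset.
snapshot = {
    "volume_of_mention": 1250,
    "btc_volume_of_mention": 900,
    "events": {"cz_tweet_mentions_project"},
    "patterns": {"daily_bearish_divergence"},
}

@dataclass
class IndicatorTrigger:          # indicator crosses a fixed threshold
    key: str
    threshold: float
    def fires(self, snap): return snap[self.key] > self.threshold

@dataclass
class ComparisonTrigger:         # one indicator compared against another
    key: str
    other_key: str
    def fires(self, snap): return snap[self.key] > snap[self.other_key]

@dataclass
class EventTrigger:              # a named sentiment event occurred
    event: str
    def fires(self, snap): return self.event in snap["events"]

@dataclass
class PatternTrigger:            # a technical pattern was detected
    pattern: str
    def fires(self, snap): return self.pattern in snap["patterns"]

triggers = [
    IndicatorTrigger("volume_of_mention", 1000),
    ComparisonTrigger("volume_of_mention", "btc_volume_of_mention"),
    EventTrigger("cz_tweet_mentions_project"),
    PatternTrigger("daily_bearish_divergence"),
]

print([t.fires(snapshot) for t in triggers])   # -> [True, True, True, True]
```

Composite strategies could then be formed by combining such predicates with AND/OR logic before they are handed to the backtester.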
Hermes allows users to freely express their creativity by building their own trading systems, and it quickly provides performance evaluation and optimization suggestions based on actual backtesting data. Users can also share their best trading systems with Hermes, which can use professional indicators and models to fill gaps in their market analysis, improving the success rate of those trading systems.
Hermes itself continually carries out long-term deep learning, providing a series of quantitative strategies and constantly backtesting and iterating on its trading systems. Users can rely entirely on Hermes' quantitative signals for trading.
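As a sketch of the kind of backtesting loop referenced above: given a price series and a long/flat signal series, accumulate returns only while the signal is long. The prices, signals, and zero-cost assumption are hypothetical and greatly simplified relative to a production backtester.

```python
# Greatly simplified long-only backtest over a hypothetical price series.
prices  = [100.0, 102.0, 101.0, 105.0, 103.0, 108.0]
signals = [0, 1, 1, 1, 0, 0]   # 1 = long, 0 = flat (hypothetical strategy output)

equity = 1.0
for i in range(1, len(prices)):
    period_return = prices[i] / prices[i - 1] - 1.0
    if signals[i - 1] == 1:    # position was held over this interval
        equity *= 1.0 + period_return

print(f"final equity multiple: {equity:.4f}")
```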