Fulturate/tiktok-parser

tiktok_scraper

tiktok_scraper is a Rust library designed to search for TikTok videos using a multi-layered strategy. It prioritizes performance and reliability by utilizing a Redis cache, the official TikTok Research API, and a fallback Selenium WebDriver scraper.

Features

  • Hybrid Search Strategy: Automatically switches between data sources based on availability:
    1. Redis Cache: Returns previously fetched results to minimize latency.
    2. TikTok API: Uses the official Research API (if a token is provided).
    3. Scraper Fallback: Uses a headless browser to scrape results if the API is unavailable.
  • Browser Pooling: Maintains a pool of WebDriver sessions to reduce initialization overhead.
  • Stealth Scraping: Implements "eager" loading strategies and modifies browser arguments to mitigate bot detection.
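The lookup order described under Hybrid Search Strategy can be sketched roughly as follows. This is a minimal, synchronous illustration and not the library's actual implementation: `HybridSearch`, `Source`, and the closure-based `api` and `scraper` parameters are hypothetical names, and an in-memory `HashMap` stands in for Redis.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Which layer produced a result (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Cache,
    Api,
    Scraper,
}

/// In-memory stand-in for the Redis cache: query -> (results, expiry time).
pub struct HybridSearch {
    cache: HashMap<String, (Vec<String>, Instant)>,
    ttl: Duration,
    api_token: String,
}

impl HybridSearch {
    pub fn new(api_token: &str, ttl: Duration) -> Self {
        Self {
            cache: HashMap::new(),
            ttl,
            api_token: api_token.to_string(),
        }
    }

    /// Tries the cache, then the API (only if a token is set), then the scraper.
    pub fn search<A, S>(
        &mut self,
        query: &str,
        api: A,
        scraper: S,
    ) -> Result<(Vec<String>, Source), String>
    where
        A: FnOnce(&str) -> Result<Vec<String>, String>,
        S: FnOnce(&str) -> Result<Vec<String>, String>,
    {
        // 1. Cache: return a stored result if it has not expired yet.
        if let Some((videos, expires_at)) = self.cache.get(query) {
            if Instant::now() < *expires_at {
                return Ok((videos.clone(), Source::Cache));
            }
        }

        // 2. API: skipped entirely when no token is configured.
        let api_result = if self.api_token.is_empty() {
            None
        } else {
            api(query).ok()
        };

        // 3. Scraper: used when the API was skipped or returned an error.
        let (videos, source) = match api_result {
            Some(videos) => (videos, Source::Api),
            None => (scraper(query)?, Source::Scraper),
        };

        // Store the fresh result so the next identical query hits the cache.
        let expires_at = Instant::now() + self.ttl;
        self.cache
            .insert(query.to_string(), (videos.clone(), expires_at));
        Ok((videos, source))
    }
}
```

With an empty token, the first call for a query falls through to the scraper and a repeated call within the TTL is answered from the cache. Whether the real library also caches scraper results, or treats API errors this way, is an assumption of this sketch.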

Prerequisites

Before using this library, ensure the following dependencies are running:

  1. Redis Server: Used for caching search results.
  2. Selenium WebDriver: A Chrome instance controlled via WebDriver. We recommend using Docker to run a compatible Selenium Standalone Chrome instance.

Running Selenium with Docker

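Run the following command to start a standalone Selenium Chrome container with 2 GB of shared memory and an overridden session limit. Replace NUM with the maximum number of concurrent browser sessions; it should be at least as large as the browser_instances value you configure in the client: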

docker run -d \
  -p 4444:4444 \
  --shm-size="2g" \
  -e SE_NODE_MAX_SESSIONS=NUM \
  -e SE_NODE_OVERRIDE_MAX_SESSIONS=true \
  --name tiktok-chrome \
  selenium/standalone-chrome
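The container can take a few seconds to start listening, so it may help to wait for port 4444 before creating the client. The helper below is a hypothetical sketch using only the standard library; `port_is_open` and `wait_for_port` are not part of this crate, and an open port only shows that something is listening, not that Selenium is ready to accept sessions.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
pub fn port_is_open(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

/// Polls the port up to `attempts` times, sleeping `delay` between tries.
/// Returns false if every attempt fails (or if `attempts` is zero).
pub fn wait_for_port(addr: SocketAddr, attempts: u32, delay: Duration) -> bool {
    for attempt in 0..attempts {
        if port_is_open(addr, Duration::from_secs(1)) {
            return true;
        }
        // Do not sleep after the final failed attempt.
        if attempt + 1 < attempts {
            std::thread::sleep(delay);
        }
    }
    false
}
```

For the container above, a call such as `wait_for_port("127.0.0.1:4444".parse().unwrap(), 30, Duration::from_secs(1))` would wait up to roughly a minute in the worst case (a one-second connect timeout plus a one-second delay per attempt).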

Installation

Add the library to your Cargo.toml. If the library is local, use the path dependency:

[dependencies]
tiktok_hybrid = { path = "./tiktok_hybrid" }
tokio = { version = "1", features = ["full"] }

Usage

Below is a basic example of how to initialize the client and perform a search.

use tiktok_hybrid::{TikTokClient, TikTokConfig};

#[tokio::main]
async fn main() {
    // Configure the client
    let config = TikTokConfig {
        redis_url: "redis://127.0.0.1/".to_string(),
        api_token: "".to_string(), // Leave empty to force scraper usage
        webdriver_url: "http://127.0.0.1:4444".to_string(),
        browser_instances: 2,      // Number of concurrent browsers
        search_limit: 10,          // Max videos to retrieve
        cache_ttl_sec: 3600,       // Cache duration in seconds
    };
    
    let client = TikTokClient::new(config)
        .await
        .expect("Failed to initialize TikTok Client");

    println!("Client initialized. Starting search...");

    // Search
    match client.search("linux terminal tips").await {
        Ok(videos) => {
            println!("Found {} videos:", videos.len());
            for video in videos {
                println!("- Title: {}", video.title);
                println!("  URL: {}", video.url);
                println!("  Command: {}", video.download_cmd);
            }
        }
        Err(e) => eprintln!("Search failed: {}", e),
    }

    // Closes browser sessions
    client.shutdown().await;
}

Configuration

The TikTokConfig struct allows for the following customizations:

Field              Type    Description
redis_url          String  Connection string for the Redis server.
api_token          String  Bearer token for the TikTok Research API. If empty, the API step is skipped.
webdriver_url      String  URL of the Selenium WebDriver endpoint (http://127.0.0.1:4444 in the example above).
browser_instances  usize   Number of browser sessions to keep open in the pool.
search_limit       usize   Maximum number of videos to fetch per query.
cache_ttl_sec      u64     Time-to-live for cached results, in seconds.
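To illustrate what browser_instances controls, here is a minimal sketch of a fixed-size session pool. It is a blocking, standard-library-only stand-in and not the crate's real pool, which is presumably async; `Pool`, `Lease`, and `idle_count` are hypothetical names, and the generic `T` takes the place of a WebDriver session.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Fixed-size pool of reusable sessions (generic stand-in for WebDriver sessions).
pub struct Pool<T> {
    idle: Mutex<Vec<T>>,
    available: Condvar,
}

impl<T> Pool<T> {
    /// Creates `size` sessions up front by calling `make` with each index.
    pub fn new(size: usize, mut make: impl FnMut(usize) -> T) -> Arc<Self> {
        let idle: Vec<T> = (0..size).map(|i| make(i)).collect();
        Arc::new(Self {
            idle: Mutex::new(idle),
            available: Condvar::new(),
        })
    }

    /// Blocks until a session is free, then hands it out as a lease.
    pub fn acquire(self: &Arc<Self>) -> Lease<T> {
        let mut idle = self.idle.lock().unwrap();
        while idle.is_empty() {
            idle = self.available.wait(idle).unwrap();
        }
        let session = idle.pop();
        Lease {
            session,
            pool: Arc::clone(self),
        }
    }

    /// Number of sessions currently not leased out.
    pub fn idle_count(&self) -> usize {
        self.idle.lock().unwrap().len()
    }
}

/// Gives access to one session and returns it to the pool when dropped.
pub struct Lease<T> {
    session: Option<T>,
    pool: Arc<Pool<T>>,
}

impl<T> std::ops::Deref for Lease<T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.session.as_ref().expect("session is present until drop")
    }
}

impl<T> Drop for Lease<T> {
    fn drop(&mut self) {
        if let Some(session) = self.session.take() {
            self.pool.idle.lock().unwrap().push(session);
            // Wake one caller that may be blocked in `acquire`.
            self.pool.available.notify_one();
        }
    }
}
```

Because sessions are created once and reused, each search pays only the cost of borrowing a session. With a pool of size N, at most N searches can scrape at the same time, which is why the Selenium session limit should not be lower than browser_instances.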

Testing

The project includes integration tests that exercise the scraper functionality.

To run the tests (ensure Docker containers are running):

# Run live scraping tests (requires internet connection and WebDriver)
cargo test -- --ignored --test-threads=1

License

This project is licensed under the MIT License. See the LICENSE file for details.
