All Posts
- nextjs (14)
- web-scraping (8)
- react (8)
- hono (5)
- scrapy (5)
- supabase (4)
- authjs (3)
- cloudflare (3)
- undetected-chromedriver (3)
- vercel (3)
- react-query (3)
- protobuf (3)
- shadcn (3)
- authentication (2)
- curl_cffi (2)
- clerk (2)
- deployment (2)
- neon (2)
- git (2)
- prisma (2)
- tailwind-css (2)
- zustand (2)
- oauth (1)
- credential (1)
- browser-automation (1)
- free-tier (1)
- cloud-databases (1)
- sql (1)
- nosql (1)
- aws (1)
- planetscale (1)
- drizzle (1)
- postgresql (1)
- giscus (1)
- jotai (1)
- newsletter (1)
- mailerlite (1)
- middleware (1)
- redux (1)
- reford (1)
- english (1)
- ai-integration (1)
- sonner (1)
- s3 (1)
- tiptap (1)
- vscode (1)
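Language learning usually starts with speaking or with memorizing vocabulary lists, but the natural way to learn is through imitation, not by memorizing grammar and structured sentences. Once something is understood, it becomes much easier to imitate and to produce.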
The CustomLogStats component extends the functionality of Scrapy by pushing collected statistics to a Redis server.
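As a rough illustration of the idea (not the CustomLogStats code itself), the core step can be sketched as a helper that writes a stats dictionary into a Redis hash. The function name `push_stats` and the key name are hypothetical. The `client` is assumed to be a redis-py style object exposing `hset(name, mapping=...)`, and in Scrapy the dictionary would typically come from `crawler.stats.get_stats()`. Wiring this into Scrapy's signals is left out.

```python
def push_stats(client, key, stats):
    """Write a stats dict to a Redis hash, storing every value as a string.

    `client` is assumed to be a redis-py style client (hypothetical wiring);
    only its hset(name, mapping=...) method is used here.
    """
    # Stats can hold ints, floats, and datetimes, so normalise to strings.
    mapping = {str(name): str(value) for name, value in stats.items()}
    if mapping:  # skip the call entirely when there is nothing to store
        client.hset(key, mapping=mapping)
    return mapping
```

Storing the stats in a single hash keeps one crawl's counters together under one key, which makes them easy to read back with a single `HGETALL`.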
Sync raw data to S3, including .json.gz files while excluding .json files. The process specifies a time window, in seconds, for file modification, so that only recently updated files are synced. This avoids scanning all files, reduces AWS S3 file-comparison API requests, and saves costs.
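The selection step described here can be sketched locally, before any upload happens. This is a minimal sketch under stated assumptions: the function name `recently_modified` and its parameters are hypothetical, it only picks the files to sync, and the upload itself (for example with the AWS CLI or an SDK) is not shown.

```python
import time
from pathlib import Path


def recently_modified(root, window_seconds, now=None):
    """Return .json.gz files under `root` modified within the last `window_seconds`."""
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    selected = []
    # The glob pattern only matches names ending in .json.gz,
    # so plain .json files are excluded without an extra check.
    for path in Path(root).rglob("*.json.gz"):
        if path.is_file() and path.stat().st_mtime >= cutoff:
            selected.append(path)
    return sorted(selected)
```

Filtering by modification time on the local side means only the handful of recent files ever need to be compared against the bucket.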
This post introduces a custom Scrapy pipeline that uses a JsonLinePipeline to process and store data efficiently.
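For readers unfamiliar with the pattern, a JSON Lines pipeline can be sketched roughly as follows. This is not the post's JsonLinePipeline; the class name `JsonLinesWriterPipeline` and the default output path are hypothetical. It relies only on the standard Scrapy item pipeline hooks `open_spider`, `close_spider`, and `process_item`, and assumes items can be converted with `dict(item)`.

```python
import json


class JsonLinesWriterPipeline:
    """Hypothetical pipeline that appends each item as one JSON object per line."""

    def __init__(self, path="items.jsonl"):
        self.path = path  # assumed output location, not taken from the post
        self.file = None

    def open_spider(self, spider):
        # Open once per crawl instead of once per item.
        self.file = open(self.path, "a", encoding="utf-8")

    def close_spider(self, spider):
        if self.file is not None:
            self.file.close()
            self.file = None

    def process_item(self, item, spider):
        # One line per item keeps the output appendable and streamable.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item
```

Because each line is an independent JSON document, a partially written file from an interrupted crawl still parses line by line.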