fix(route/zhihu): obtain `__zse_ck` from a browser session #22319
Open
DzmingLi wants to merge 2 commits into
`__zse_ck` is computed at runtime by Zhihu's JS from the device fingerprint and `d_c0`, rotates every few days, and is cross-checked against `d_c0` by the backend, so it has to come from a real browser session. Drive a browser seeded with the configured cookies (including the `z_c0` login cookie that most endpoints now require) to compute a fresh, consistent `__zse_ck`, harvest the cookie jar, and cache it for 30 minutes so it is refreshed automatically. Recommended ZHIHU_COOKIES: "d_c0=...; z_c0=..." (omit `__zse_ck`).
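The 30-minute caching described above can be sketched as a small time-to-live cache. This is only an illustration, not the code from this PR: `createTtlCache`, `THIRTY_MINUTES_MS`, and the injectable `now` clock are hypothetical names, and the loader is synchronous here to keep the sketch short, whereas a real loader that launches a browser would be asynchronous.

```typescript
// Hypothetical constant: the 30-minute lifetime mentioned in the description.
const THIRTY_MINUTES_MS = 30 * 60 * 1000;

// Returns a getter that calls `load` at most once per `ttlMs` window.
// `now` is injectable so the expiry logic can be exercised without waiting.
function createTtlCache<T>(
    ttlMs: number,
    now: () => number = () => Date.now()
): (load: () => T) => T {
    let entry: { value: T; expiresAt: number } | undefined;

    return (load: () => T): T => {
        const t = now();
        // Serve the cached value while it is still inside its lifetime.
        if (entry !== undefined && t < entry.expiresAt) {
            return entry.value;
        }
        // Otherwise recompute and restart the lifetime from the current time.
        const value = load();
        entry = { value, expiresAt: t + ttlMs };
        return value;
    };
}
```

A caller would create one getter with `createTtlCache<string>(THIRTY_MINUTES_MS)` and pass it the function that harvests cookies, so the expensive browser launch only happens when the cached value has expired.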
The user's HTML homepage (www.zhihu.com/people/:id) is now rate-limited (403) more aggressively than the API, which broke /zhihu/posts on the profile fetch even though the article-list API works. Read the profile (name, headline, avatar) from /api/v4/members/:id instead.
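As a rough sketch of the profile change, the feed metadata can be derived from the member API response rather than from the HTML page. The response field names (`name`, `headline`, `avatar_url`), the `www.zhihu.com` host, the feed title wording, and the helper names `memberApiUrl` and `toFeedMeta` are all assumptions made for illustration; the PR's actual code may differ.

```typescript
// Assumed subset of the /api/v4/members/:id response (field names are assumptions).
interface ZhihuMember {
    name: string;
    headline?: string;
    avatar_url?: string;
}

// Hypothetical shape of the feed-level metadata the route needs.
interface FeedMeta {
    title: string;
    description: string;
    image: string | undefined;
    link: string;
}

// Builds the member API URL; the host is an assumption, the path is from the PR text.
function memberApiUrl(id: string): string {
    return "https://www.zhihu.com/api/v4/members/" + encodeURIComponent(id);
}

// Maps the API response to feed metadata, tolerating missing optional fields.
function toFeedMeta(id: string, member: ZhihuMember): FeedMeta {
    return {
        title: member.name + "'s Zhihu posts", // illustrative title format
        description: member.headline !== undefined ? member.headline : "",
        image: member.avatar_url,
        link: "https://www.zhihu.com/people/" + encodeURIComponent(id) + "/posts",
    };
}
```

Keeping the mapping separate from the HTTP call means it can be checked without network access, and the request itself can reuse whatever signed-header logic the other API calls already go through.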
Contributor
Auto Review: No clear rule violations found in the current diff.
Involved Issue
Close #20402
Example for the Proposed Route(s)
New RSS Route Checklist
Puppeteer
Note
Two related fixes for Zhihu's tightened anti-crawler:
1. `__zse_ck` from a browser session. `__zse_ck` is generated at runtime by Zhihu's JS from the device fingerprint and `d_c0`, rotates every few days, and is cross-checked against `d_c0` by the backend, so it has to come from a real browser session. This drives a headless browser, seeded with the configured cookies, to compute a fresh `__zse_ck`, harvests the cookie jar, and caches it for 30 minutes so it is refreshed automatically. `getSignedHeader` signs `x-zse-96` with the same `d_c0` that is sent.

   Most content (column items, answers, activities) now also requires the `z_c0` login cookie. Recommended config: `ZHIHU_COOKIES=d_c0=...; z_c0=...` (omit `__zse_ck` so it is refreshed automatically; a pinned `__zse_ck` is trusted as-is and will expire). A fully pinned `d_c0` + `__zse_ck` pair is still honored, so no browser is needed in that case.

2. `/zhihu/posts` profile via API. The route scraped the user's HTML homepage (www.zhihu.com/people/:id) for the feed title/avatar; that page is now rate-limited (403) more aggressively than the API, breaking the route even though the article-list API works. It now reads the profile from `/api/v4/members/:id` instead.

Deployed and verified end-to-end: `/zhihu/zhuanlan/<id>`, `/zhihu/people/activities/<id>` and `/zhihu/posts/people/<id>` return 200 with full items; the harvested cookie is served from cache within the TTL (cold ~18s for the browser launch, warm ~2s).

Supersedes #21321.
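The cookie handling described in the first fix could be sketched roughly as follows. The helper names (`parseCookies`, `needsBrowser`, `serializeCookies`) are hypothetical, and the rule that a browser is needed unless both `d_c0` and `__zse_ck` are configured is an assumption read from the description, not the PR's actual implementation.

```typescript
// Cookie name -> value, as parsed from a ZHIHU_COOKIES-style string.
type CookieJar = Record<string, string>;

// Splits "a=1; b=2" into a jar. Only the first "=" separates name from value,
// so values that themselves contain "=" are preserved.
function parseCookies(header: string): CookieJar {
    const jar: CookieJar = {};
    for (const part of header.split(";")) {
        const eq = part.indexOf("=");
        if (eq === -1) {
            continue; // skip fragments without a name=value shape
        }
        const name = part.slice(0, eq).trim();
        if (name !== "") {
            jar[name] = part.slice(eq + 1).trim();
        }
    }
    return jar;
}

function hasCookie(jar: CookieJar, name: string): boolean {
    return Object.prototype.hasOwnProperty.call(jar, name);
}

// Assumption: a fully pinned d_c0 + __zse_ck pair is used as-is;
// anything else requires a browser session to compute __zse_ck.
function needsBrowser(configured: CookieJar): boolean {
    return !(hasCookie(configured, "d_c0") && hasCookie(configured, "__zse_ck"));
}

// Serializes a jar back into a Cookie request-header value.
function serializeCookies(jar: CookieJar): string {
    return Object.keys(jar)
        .map((name) => name + "=" + jar[name])
        .join("; ");
}
```

With the recommended `d_c0=...; z_c0=...` configuration, `needsBrowser` returns `true`, so `__zse_ck` would be harvested from the browser and refreshed on the cache schedule; with a pinned pair it returns `false` and no browser is launched.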