IT虾米网

A Detailed Guide to Using a Proxy IP with HttpClient

sanshao, December 3, 2020, Programming Languages

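When crawling web pages, some sites apply anti-crawler measures that cause the server to reject your requests. Sending the requests through a proxy IP is one way to get around these rejections.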

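Proxy IPs fall into four types: transparent proxies, anonymous proxies, distorting proxies, and high-anonymity proxies. They differ in the values the target server sees.

  1. Transparent Proxy: a transparent proxy "hides" your IP address, but your real IP can still be read from HTTP_X_FORWARDED_FOR.
    REMOTE_ADDR = Proxy IP
    HTTP_VIA = Proxy IP
    HTTP_X_FORWARDED_FOR = Your IP
  2. Anonymous Proxy: a step up from a transparent proxy. Others can tell that you are using a proxy, but not who you are.
    REMOTE_ADDR = Proxy IP
    HTTP_VIA = Proxy IP
    HTTP_X_FORWARDED_FOR = Proxy IP
  3. Distorting Proxy: others can still tell that you are using a proxy, but they are given a fake IP address, which makes the disguise more convincing.
    REMOTE_ADDR = Proxy IP
    HTTP_VIA = Proxy IP
    HTTP_X_FORWARDED_FOR = Random IP address
  4. High Anonymity Proxy (Elite Proxy): others cannot tell that you are using a proxy at all.
    REMOTE_ADDR = Proxy IP
    HTTP_VIA = not determined
    HTTP_X_FORWARDED_FOR = not determined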
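
The four cases above can be read as a small decision rule on the server side. The following is only a sketch of that rule, not part of the original article: the class `ProxyAnonymity`, the enum `Level`, and the `realClientIp` parameter are hypothetical names, and the code assumes the client's real IP is known (for example, in a test setup where you check your own proxy). Treating a missing HTTP_X_FORWARDED_FOR together with a present HTTP_VIA as "anonymous" is also an assumption.

```java
import java.util.Objects;

/** Hypothetical helper that classifies a proxy from the three values listed above. */
final class ProxyAnonymity {

    enum Level { TRANSPARENT, ANONYMOUS, DISTORTING, ELITE }

    /**
     * @param remoteAddr    REMOTE_ADDR as seen by the server (the proxy IP)
     * @param via           HTTP_VIA, or null if the header is absent
     * @param xForwardedFor HTTP_X_FORWARDED_FOR, or null if the header is absent
     * @param realClientIp  the client's real IP (assumed to be known by the caller)
     */
    static Level classify(String remoteAddr, String via,
                          String xForwardedFor, String realClientIp) {
        // Neither header is present: nothing reveals that a proxy is in use.
        if (via == null && xForwardedFor == null) {
            return Level.ELITE;
        }
        // The real IP leaks through HTTP_X_FORWARDED_FOR.
        if (Objects.equals(xForwardedFor, realClientIp)) {
            return Level.TRANSPARENT;
        }
        // Only the proxy's own IP is exposed (assumption: a missing
        // HTTP_X_FORWARDED_FOR with HTTP_VIA present counts as anonymous too).
        if (xForwardedFor == null || Objects.equals(xForwardedFor, remoteAddr)) {
            return Level.ANONYMOUS;
        }
        // Some other address is reported: a fake IP.
        return Level.DISTORTING;
    }
}
```

For example, `ProxyAnonymity.classify("10.0.0.1", "10.0.0.1", "203.0.113.7", "203.0.113.7")` falls into the transparent case, because the forwarded address equals the real client IP.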

import org.apache.http.HttpEntity; 
import org.apache.http.HttpHost; 
import org.apache.http.client.config.RequestConfig; 
import org.apache.http.client.methods.CloseableHttpResponse; 
import org.apache.http.client.methods.HttpGet; 
import org.apache.http.impl.client.CloseableHttpClient; 
import org.apache.http.impl.client.HttpClients; 
import org.apache.http.util.EntityUtils; 
import org.junit.Test; 
/**
 * Example: sending a GET request through a proxy IP with Apache HttpClient.
 *
 * @author test
 * @date 2018/12/12 16:07
 */
public class JunitHttpClient { 
 
    @Test 
    public void test() throws Exception {
        // Create the HttpGet instance
        HttpGet httpGet = new HttpGet("https://www.****.com");
        // Set the request header
        httpGet.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36");
        // Execute the HTTP GET request (a POST request works the same way).
        // try-with-resources closes the response and the client even if the request fails.
        try (CloseableHttpClient client = setProxy(httpGet, "192.168.1.1", 8888);
             CloseableHttpResponse response = client.execute(httpGet)) {
            // Read the response entity
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                System.out.println("Page content: " + EntityUtils.toString(entity, "utf-8"));
            }
        }
    }
    /**
     * Configures the request to go through the given proxy.
     *
     * @param httpGet   the request to route through the proxy
     * @param proxyIp   the proxy IP address
     * @param proxyPort the proxy port
     * @return a default CloseableHttpClient used to execute the request
     */
    public CloseableHttpClient setProxy(HttpGet httpGet, String proxyIp, int proxyPort) {
        // Create the HttpClient instance
        CloseableHttpClient httpClient = HttpClients.createDefault();
        // Set the proxy IP and port
        HttpHost proxy = new HttpHost(proxyIp, proxyPort, "http");
        // Timeouts can be set here as well:
        // RequestConfig requestConfig = RequestConfig.custom().setProxy(proxy).setConnectTimeout(3000).setSocketTimeout(3000).setConnectionRequestTimeout(3000).build();
        RequestConfig requestConfig = RequestConfig.custom().setProxy(proxy).build();
        httpGet.setConfig(requestConfig);
        return httpClient;
    }
}
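
For comparison, the same idea can be sketched with the HTTP client built into the JDK (`java.net.http.HttpClient`, Java 11 and later) instead of Apache HttpClient. This is not the approach used in the listing above, only an alternative under stated assumptions: the class name `JdkProxyClient` and the method `newClient` are hypothetical, and the 3-second connect timeout simply mirrors the 3000 ms value mentioned in the comment above.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.http.HttpClient;
import java.time.Duration;

/** Hypothetical helper: builds a JDK HttpClient that routes requests through a proxy. */
final class JdkProxyClient {

    static HttpClient newClient(String proxyIp, int proxyPort) {
        return HttpClient.newBuilder()
                // Send every http/https request through the given proxy address
                .proxy(ProxySelector.of(new InetSocketAddress(proxyIp, proxyPort)))
                // Give up if the connection cannot be established within 3 seconds
                .connectTimeout(Duration.ofSeconds(3))
                .build();
    }
}
```

Here the proxy is configured on the client rather than on each request, so a client returned by `JdkProxyClient.newClient("192.168.1.1", 8888)` uses the proxy for every request sent with it.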

 
