Hey guys!
Are you all right?

In this post I would like to share a solution that I needed to develop in my company to carry out web scraping and extract data from branches on the Correios website (http://www2.correios.com.br/sistemas/agencias/). As you can imagine, to achieve this solution and bring the data into my database quickly, I used my old companion SQLCLR to carry out this not-so-simple task, as the post office website contained some errors in the site's XHTML and this ended up complicating my life a little when dealing with the data.

To be able to process this data, I will use regular expression techniques (Regexp or Regex), which I demonstrated in my last post SQL Server – How to use regular expressions (RegExp) in your database, therefore, I recommend reading this Regexp article before this article here.

Information on the source website:

Information returned by the Stored Procedure using Web Scraping:

How the solution works in theory

I will try to summarize the steps to perform web scraping and explain how I did it to achieve this result.

  • The first step is to identify the URL used to return data via Ajax on the website

  • Now that I have identified the URL, the next step is to ensure that the city name does not have accents (Ex: Vitória). To do this, I will use the fncRemove_Acentuacao function (I will release the codes below).
  • Another important point is to encode the URL to transform spaces and other characters as HTML entities. (Ex: Space = %20). To do this, I will use the fncCodifica_URL function
  • Now I can create the complete URL to send the HttpRequest request. To do this, I will use the Stored Procedure stpWs_Requisicao, which performs the request and returns the result in an output string
  • The return of this request is the HTML code returned by the website, which is in this format:

  • As you can see, it is not a trivial task to deal with this XHTML, especially since it contains some elements that are open and not closed, generating an error when trying to convert it to XML and process it through SQL Server. Therefore, I will treat the return using another technique: Regular expression (RegExp)
  • I use the mask “(.*?)” in the fncRegex_Match function to return all values ​​that are within the tags td. In addition to this, I also use the fncRemove_Html_String functions to transform an HTML string to text and also the fncRecupera_Letras_Sem_Caractere_Especial function to remove any special characters in the string.
  • From there, I will apply some filters to separate and categorize the results of these treatments and thus be able to correctly classify the information and return this data in table format

Solution source code

Now that I've explained how the solution works in theory, it's time to apply it and demonstrate it in practice.

Source code for the SQLCLR assembly, which will be used in this article (T-SQL):

USE [dirceuresende]
GO

ALTER DATABASE [dirceuresende] SET TRUSTWORTHY ON
GO


IF (OBJECT_ID('dbo.fncCodifica_URL') IS NOT NULL) DROP FUNCTION [dbo].[fncCodifica_URL]
GO

IF (OBJECT_ID('dbo.fncRecupera_Letras_Sem_Caractere_Especial') IS NOT NULL) DROP FUNCTION [dbo].[fncRecupera_Letras_Sem_Caractere_Especial]
GO

IF (OBJECT_ID('dbo.fncRegex_Match') IS NOT NULL) DROP FUNCTION [dbo].[fncRegex_Match]
GO

IF (OBJECT_ID('dbo.fncRemove_Acentuacao') IS NOT NULL) DROP FUNCTION [dbo].[fncRemove_Acentuacao]
GO

IF (OBJECT_ID('dbo.fncRemove_Html_String') IS NOT NULL) DROP FUNCTION [dbo].[fncRemove_Html_String]
GO

IF (OBJECT_ID('dbo.stpWs_Requisicao') IS NOT NULL) DROP PROCEDURE [dbo].[stpWs_Requisicao]
GO

IF (EXISTS(SELECT NULL FROM sys.assemblies WHERE name = 'Webscraping_Correios')) DROP ASSEMBLY [Webscraping_Correios]
GO

CREATE ASSEMBLY [Webscraping_Correios]
    AUTHORIZATION [dbo]
    FROM 0x4D5A90000300000004000000FFFF0000B800000000000000400000000000000000000000000000000000000000000000000000000000000000000000800000000E1FBA0E00B409CD21B8014CCD21546869732070726F6772616D2063616E6E6F742062652072756E20696E20444F53206D6F64652E0D0D0A2400000000000000504500004C010300B1B6045B0000000000000000E00022200B0130000018000000060000000000000A370000002000000040000000000010002000000002000004000000000000000600000000000000008000000002000000000000030060850000100000100000000010000010000000000000100000000000000000000000B83600004F00000000400000D802000000000000000000000000000000000000006000000C000000803500001C0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200000080000000000000000000000082000004800000000000000000000002E7465787400000010170000002000000018000000020000000000000000000000000000200000602E72737263000000D80200000040000000040000001A0000000000000000000000000000400000402E72656C6F6300000C0000000060000000020000001E00000000000000000000000000004000004200000000000000000000000000000000EC360000000000004800000002000500B4270000CC0D000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001B3004006F01000001000011000F02280700000A2D090F02280800000A2B0572010000700A0F01280700000A2D090F01280800000A2B0572030000700B0F00280800000A0C7E0900000A0D0F03280700000A2D090F03280800000A2B05720B00007013041104280A00000A130611062C07720B000070130408280B00000A740800000113051105076F0C00000A00077217000070280D00000A2C0B066F0E00000A16FE022B0116130711072C5100061308280F00000A11086F1000000A1309110511098E696A6F1100000A0011056F1200000A130A00110A11091611098E696F1300000A00110A6F1400000A0000DE0D110A2C08110A6F1500000A00DC0011056F1600000A740A000001130B00110B6F1700000A130C00110C14FE03130D110D2C2B00110C1104281800000A731900000A130E00110E6F1A00000A0D00DE0D110E2C08110E6F1500000A00DC0000DE0D110C2C08110C6F1500000A00DC00DE0D110B2C08110B6F1500000A00DC0E0409281B00000A81070000012A00013400000200C9001BE4000D00000000020027010C33010D0000000002000A013A44010D00000000020000015454010D000000002202281C00000A002A000000133003004500000002000011000F00280700000A2D310F00280800000A722100007072270000706F1D00000A7237000070723B0000706F1D00000A281E00000A281B00000A2B057E1F00000A0A2B00062A000000133002008200000003000011000F00280700000A0C082C0914281B00000A0D2B6B0F00280800000A186F2000000A0A732100000A0B000613041613052B35110411056F2200000A1306001106282300000A130711071BFE0116FE01130811082C0B000711066F2400000A260000110517581305110511046F0E00000A32C0076F2500000A281B00000A0D2B00092A00001B3002006A00000004000011000F00280700000A2D090F01280700000A2B01170A062C087E1F00000A0B2B4800000F00280800000A0F01280800000A282600000A0C086F2700000A2D077E1F00000A2B16086F2800000A166F2900000A6F2A00000A281B00000A0BDE0A0D007E1F00000A0BDE00072A000001100000000021003D5E000A10000001133004007E000000050000110002280A00000A0B072C04140C2B6D020A06282B00000A0A030D092C380006723F000070282C00000A17282D00000A0A06724B000070282C00000A17282D00000A0A067259000070282C00000A17282D00000A0A00067263000070727100007017282D00000A0A0672750000707E0900000A282E00000A0A060C2B00082A00001B3003006A03000006000011000F00280700000A130511052C0C7E1F00000A1306384D0300000F01282F00000A130711072C0C7E1F00000A130638340300000F03282F00000A2D090F03283000000A2B01160A0F02280700000A2D090F02280800000A2B0572010000700B0F01283000000A0C062D140F00280800000A186F2000000A6F3100000A2B0C0F00280800000A6F3100000A0D732100000A130400097E02000004252D17267E01000004FE060B000006733200000A258002000004280100002B6F3400000A1308387802000011086F3500000A13090008130A110A39550100000006130B110B39EB0000000011091F41320911091F5A3EAD00000011091F61320911091F7A3E9E00000011091F30320911091F393E8F00000011091F203B86000000110920C00000003209110920CF0000003174110920D10000003209110920D60000003162110920D90000003209110920DD0000003150110920E00000003209110920E4000000313E110920E70000003209110920EF000000312C110920F10000003209110920F6000000311A110920F9000000320E110920FD000000FE0216FE012B01162B0117130C110C2C0C110411096F2400000A262B18076F0E00000A16FE02130D110D2C091104076F3600000A26002B590011091F41320611091F5A312011091F61320611091F7A311411091F30320611091F39310811091F20FE012B0117130E110E2C0C110411096F2400000A262B18076F0E00000A16FE02130F110F2C091104076F3600000A260000380E01000000061310111039C40000000011091F20320911091F7E3E86000000110920C00000003209110920CF0000003174110920D10000003209110920D60000003162110920D90000003209110920DD0000003150110920E00000003209110920E4000000313E110920E70000003209110920EF000000312C110920F10000003209110920F6000000311A110920F9000000320E110920FD000000FE0216FE012B01162B0117131111112C0C110411096F2400000A262B18076F0E00000A16FE02131211122C091104076F3600000A26002B3E0011091F20320B11091F7EFE0216FE012B0116131311132C0C110411096F2400000A262B18076F0E00000A16FE02131411142C091104076F3600000A2600000011086F3700000A3A7CFDFFFFDE0D11082C0811086F1500000A00DC11046F2500000A281B00000A13062B0011062A0000411C000002000000BF0000008B0200004A0300000D000000000000002E730A00000680010000042A3603282300000A1BFE0116FE012A000042534A4201000100000000000C00000076342E302E33303331390000000005006C0000008C040000237E0000F80400005406000023537472696E6773000000004C0B00008400000023555300D00B0000100000002347554944000000E00B0000EC01000023426C6F62000000000000000200000157150208090A000000FA013300160000010000002700000004000000020000000B00000010000000370000000A0000000600000003000000010000000400000001000000010000000000F40201000000000006000302A50406002302A5040600C40192040F00C50400000600AD054B030A00D8013E040A008702D4040E00E305B40506003B039B000E007201B405060018049B000A00EE013E0406003004FB0506001C066F030E00A3020305060094034B030A006403D40406004A00EB000600A901A504060058004B03060093024B030E00E705B40506005802FB05060040014B030E007601B405060025049B000E00C6024B0306005203FB050600ED036F030E00070603050E00060403050E00840303050E006A0103050E003806B4050600CB054B030E004A050305120035010C0406003C00EB0006006B042205000000006C00000000000100010001001000E9040000150001000100010010003505000015000100030003211000E7000000150001000900360068006E011600010072015020000000009600C2037A01010000220000000086188504060006000C220000000096008400890106006022000000009600AD0389010700F0220000000096009A02900108007823000000009600610299010A000424000000009600CA029F010C00002200000000861885040600100098270000000091188B04AC0110000022000000008618850406001000A4270000000083000A00B001100000000100180300000200E303000003006C05000004009E0302000500A50000000100940000000100FD0300000100FD0300000200C70000000100750000000200B70000000100FD0300000200D303000003005904000004007A0500000100A602090085040100110085040600190085040A0031008504060061008504060099008504060039000D032800390041022C00A9004B063000A90043063300B1009C013800B10017013E00A9002C064300A900A9024900B9005F004D00B900FA045200B100B4025800B10031035D004900A301620049008E010600C10094010600B10082016A00C9001F035D00B90055026F00590085047500D10006012C003900BF057D00290085040600A90022018800D90077028E00390013039300A9004B02A600690085040600A9009705AC00E9001906B10069001001B700290091022C00F100A302C700F900A105280079008C05CE0001014203D400090141022C0011012A018E0019014C01E100F1002201E500F1002201EF0089000D032800890041022800A9000D0615010C0085042101290158012701140077044B011C00D7055A01690010015F013901F205280020002300E6012E000B00B5012E001300BE012E001B00DD0160002B00E60180002B00E60183003300E601A0002B00E601C0002B00E601E0002B00E601100083009700BD00DA00F6001A0144015401048000000000000000000000000000000000570500000400000000000000000000006501DE00000000000400000000000000000000006501D2000000000004000000000000000000000065014B030000000004000000000000000000000065015E0100000000040003006700400100000000003C3E395F5F345F30003C666E6352656375706572615F4C65747261735F53656D5F4361726163746572655F457370656369616C3E625F5F345F300049456E756D657261626C6560310049456E756D657261746F7260310046756E636032006765745F55544638003C3E39003C4D6F64756C653E0044735F537472696E675F48544D4C00666E63436F6469666963615F55524C0044735F55524C0053797374656D2E494F0044735F5265746F726E6F5F4F555450555400466C5F5175656272615F4C696E68610044735F4D6173636172610053797374656D2E44617461006D73636F726C6962003C3E630053797374656D2E436F6C6C656374696F6E732E47656E657269630052656164546F456E6400417070656E64007365745F4D6574686F64005265706C6163650048746D6C4465636F646500456E756D657261626C650049446973706F7361626C65006765745F4E65774C696E650057686572650053797374656D2E436F726500436170747572650048747470576562526573706F6E736500476574526573706F6E736500436C6F736500446973706F73650043726561746500577269746500436F6D70696C657247656E6572617465644174747269627574650044656275676761626C654174747269627574650053716C50726F6365647572654174747269627574650053716C46756E6374696F6E41747472696275746500436F6D70696C6174696F6E52656C61786174696F6E734174747269627574650052756E74696D65436F6D7061746962696C697479417474726962757465006765745F56616C7565004E6F726D616C697A6500476574456E636F64696E6700666E6352656D6F76655F48746D6C5F537472696E6700457363617065557269537472696E670053716C537472696E6700546F537472696E6700666E6352656765785F4D61746368006765745F4C656E677468007365745F436F6E74656E744C656E6774680055726900666E6352656375706572615F4C65747261735F53656D5F4361726163746572655F457370656369616C005765627363726170696E675F436F727265696F732E646C6C006765745F49734E756C6C0044735F55726C00476574526573706F6E736553747265616D004765745265717565737453747265616D006765745F4974656D0053797374656D004E6F726D616C697A6174696F6E466F726D0053716C426F6F6C65616E0053797374656D2E476C6F62616C697A6174696F6E0047726F7570436F6C6C656374696F6E00457863657074696F6E0044735F436F64696669636163616F00666E6352656D6F76655F4163656E74756163616F0073747057735F5265717569736963616F00466C5F416C66616E756D657269636F0044735F4D65746F646F0043686172556E69636F6465496E666F0044735F546578746F0047726F75700053797374656D2E4C696E710053747265616D526561646572005465787452656164657200537472696E674275696C646572004D6963726F736F66742E53716C5365727665722E5365727665720044735F537562737469747569725F506F720049456E756D657261746F7200476574456E756D657261746F72002E63746F72002E6363746F720053797374656D2E446961676E6F73746963730053797374656D2E52756E74696D652E436F6D70696C6572536572766963657300446562756767696E674D6F6465730053797374656D2E446174612E53716C54797065730053746F72656450726F636564757265730047657442797465730053797374656D2E546578742E526567756C617245787072657373696F6E730053797374656D2E436F6C6C656374696F6E730055736572446566696E656446756E6374696F6E730052656765784F7074696F6E73005765627363726170696E675F436F727265696F730044735F506172616D6574726F7300466C5F4163656974615F4163656E746F73006765745F47726F757073006765745F4368617273006765745F53756363657373004F626A6563740053797374656D2E4E6574006F705F496D706C6963697400456E7669726F6E6D656E74006765745F43757272656E74004874747057656252657175657374004D6F76654E6578740053797374656D2E5465787400526567657800546F43686172417272617900476574556E69636F646543617465676F7279006F705F457175616C697479005765625574696C6974790049734E756C6C4F72456D7074790000000000010007470045005400000B5500540046002D003800010950004F005300540000055C005C00000F68007400740070003A002F002F0000035C0000032F00000B3C00620072002F003E00000D3C006200720020002F003E0000093C00620072003E00000D26006E006200730070003B0000032000000B3C002E002A003F003E0000000000E8F81B1ABDCAAF498C6FFAE0314C37CE0004200101080320000105200101111117070F0E0E0E0E0E122102020E1D0512251229122502122D032000020320000E02060E040001020E05000112590E042001010E050002020E0E03200008040000125D0520011D050E042001010A0420001225072003011D0508080420001265050001125D0E072002011225125D050001111D0E040701111D0520020E0E0E0400010E0E0306111D0E07090E123502111D0E08031139020520010E1171042001030805000111390305200112350309070402111D123D1241060002123D0E0E052000128081052001127D080607040E020E020300000E0900040E0E0E0E1180910600030E0E0E0E1E0715020E021D03123502111D0215124901030302020202020202020202020420001D0306151251020302052002011C181810010215128099011E0015128099011E00151251021E0002030A010306151280990103082000151249011300051512490103042000130005200112350E08B77A5C561934E0890306121007061512510203020E000501111D111D111D111D10111D060001111D111D080002111D111D111D0500020E0E020C0004111D111D1145111D11450300000104200102030801000800000000001E01000100540216577261704E6F6E457863657074696F6E5468726F77730108010007010000000004010000000000000000B1B6045B00000000020000001C0100009C3500009C17000052534453A5E0CF413234D0468D51838C5CDECBDE01000000633A5C75736572735C646966696C5C646F63756D656E74735C76697375616C2073747564696F20323031375C50726F6A656374735C5765627363726170696E675F436F727265696F735C5765627363726170696E675F436F727265696F735C6F626A5C44656275675C5765627363726170696E675F436F727265696F732E7064620000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000E03600000000000000000000FA360000002000000000000000000000000000000000000000000000EC360000000000000000000000005F436F72446C6C4D61696E006D73636F7265652E646C6C0000000000FF2500200010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001001000000018000080000000000000000000000000000001000100000030000080000000000000000000000000000001000000000048000000584000007C02000000000000000000007C0234000000560053005F00560045005200530049004F004E005F0049004E0046004F0000000000BD04EFFE00000100000000000000000000000000000000003F000000000000000400000002000000000000000000000000000000440000000100560061007200460069006C00650049006E0066006F00000000002400040000005400720061006E0073006C006100740069006F006E00000000000000B004DC010000010053007400720069006E006700460069006C00650049006E0066006F000000B801000001003000300030003000300034006200300000002C0002000100460069006C0065004400650073006300720069007000740069006F006E000000000020000000300008000100460069006C006500560065007200730069006F006E000000000030002E0030002E0030002E003000000052001900010049006E007400650072006E0061006C004E0061006D00650000005700650062007300630072006100700069006E0067005F0043006F0072007200650069006F0073002E0064006C006C00000000002800020001004C006500670061006C0043006F0070007900720069006700680074000000200000005A00190001004F0072006900670069006E0061006C00460069006C0065006E0061006D00650000005700650062007300630072006100700069006E0067005F0043006F0072007200650069006F0073002E0064006C006C0000000000340008000100500072006F006400750063007400560065007200730069006F006E00000030002E0030002E0030002E003000000038000800010041007300730065006D0062006C0079002000560065007200730069006F006E00000030002E0030002E0030002E0030000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003000000C0000000C3700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
    WITH PERMISSION_SET = EXTERNAL_ACCESS;


GO

PRINT N'Creating [dbo].[fncCodifica_URL]...';
GO

CREATE FUNCTION [dbo].[fncCodifica_URL]
(@Ds_URL NVARCHAR (MAX) NULL)
RETURNS NVARCHAR (MAX)
AS
 EXTERNAL NAME [Webscraping_Correios].[UserDefinedFunctions].[fncCodifica_URL]


GO
PRINT N'Creating [dbo].[fncRecupera_Letras_Sem_Caractere_Especial]...';

GO


CREATE FUNCTION [dbo].[fncRecupera_Letras_Sem_Caractere_Especial]
(@Ds_Texto NVARCHAR (MAX) NULL, @Fl_Alfanumerico BIT NULL, @Ds_Substituir_Por NVARCHAR (MAX) NULL, @Fl_Aceita_Acentos BIT NULL)
RETURNS NVARCHAR (MAX)
AS
 EXTERNAL NAME [Webscraping_Correios].[UserDefinedFunctions].[fncRecupera_Letras_Sem_Caractere_Especial]


GO
PRINT N'Creating [dbo].[fncRegex_Match]...';
GO

CREATE FUNCTION [dbo].[fncRegex_Match]
(@Ds_Texto NVARCHAR (MAX) NULL, @Ds_Mascara NVARCHAR (MAX) NULL)
RETURNS NVARCHAR (MAX)
AS
 EXTERNAL NAME [Webscraping_Correios].[UserDefinedFunctions].[fncRegex_Match]


GO
PRINT N'Creating [dbo].[fncRemove_Acentuacao]...';


GO

CREATE FUNCTION [dbo].[fncRemove_Acentuacao]
(@Ds_Texto NVARCHAR (MAX) NULL)
RETURNS NVARCHAR (MAX)
AS
 EXTERNAL NAME [Webscraping_Correios].[UserDefinedFunctions].[fncRemove_Acentuacao]


GO
PRINT N'Creating [dbo].[fncRemove_Html_String]...';


GO

CREATE FUNCTION [dbo].[fncRemove_Html_String]
(@Ds_String_HTML NVARCHAR (MAX) NULL, @Fl_Quebra_Linha BIT NULL)
RETURNS NVARCHAR (MAX)
AS
 EXTERNAL NAME [Webscraping_Correios].[UserDefinedFunctions].[fncRemove_Html_String]


GO
PRINT N'Creating [dbo].[stpWs_Requisicao]...';
GO


CREATE PROCEDURE [dbo].[stpWs_Requisicao]
@Ds_Url NVARCHAR (MAX) NULL, @Ds_Metodo NVARCHAR (MAX) NULL, @Ds_Parametros NVARCHAR (MAX) NULL, @Ds_Codificacao NVARCHAR (MAX) NULL, @Ds_Retorno_OUTPUT NVARCHAR (MAX) NULL OUTPUT
AS EXTERNAL NAME [Webscraping_Correios].[StoredProcedures].[stpWs_Requisicao]
GO

PRINT N'Update complete.';

stpBusca_Agencias_Correios source code (T-SQL):
View source code

USE [dirceuresende]
GO

IF (OBJECT_ID('dbo.stpBusca_Agencias_Correios') IS NULL) EXEC('CREATE PROCEDURE dbo.stpBusca_Agencias_Correios AS SELECT 1')
GO

ALTER PROCEDURE dbo.stpBusca_Agencias_Correios (
    @Ds_Cidade VARCHAR(50) = 'Vitória',
    @Sg_UF CHAR(2) = 'ES'
)
AS BEGIN


    -- DECLARE @Ds_Cidade VARCHAR(50) = 'Vitória', @Sg_UF CHAR(2) = 'ES'

    DECLARE 
        @Ds_Retorno_OUTPUT NVARCHAR(MAX),
        @Url VARCHAR(MAX)


    SET @Sg_UF = UPPER(@Sg_UF)
    SET @Ds_Cidade = dbo.fncCodifica_URL(UPPER(dbo.fncRemove_Acentuacao(@Ds_Cidade)))

    
    SET @Url = 'http://www2.correios.com.br/sistemas/agencias/inc/divDetalheAgencia.cfm?uf=' + @Sg_UF + '&municipio=' + @Ds_Cidade + '%20%20%20%20%20%20%20&bairro=&horario=&atendimento=&_cf_containerId=divDetalheAgencia&_cf_nodebug=true&_cf_nocache=true&_cf_clientid=D5982645C0BADD54FDD972B4246C6C48&_cf_rc=30'

    
    EXEC dbo.stpWs_Requisicao
        @Ds_Url = @Url,                                 -- nvarchar(max)
        @Ds_Metodo = N'GET',                              -- nvarchar(max)
        @Ds_Parametros = N'',                          -- nvarchar(max)
        @Ds_Codificacao = N'iso-8859-1',                         -- nvarchar(max)
        @Ds_Retorno_OUTPUT = @Ds_Retorno_OUTPUT OUTPUT -- nvarchar(max)

    
    DECLARE @Contador INT = 1, @Total INT = LEN(@Ds_Retorno_OUTPUT), @Resultado VARCHAR(MAX) = '', @String VARCHAR(MAX), @contaLinhas INT = 0, @StringSemHTML VARCHAR(MAX)
    DECLARE @Retorno TABLE ( id INT IDENTITY(1,1) NOT NULL, texto VARCHAR(MAX) NOT NULL, ranking INT NOT NULL, id_grupo INT NULL )

    WHILE(@Resultado IS NOT NULL)
    BEGIN
    
        SET @String = SUBSTRING(@Ds_Retorno_OUTPUT, @Contador, @Total)
        SET @Resultado = dbo.fncRegex_Match(@String, '<td .*?>(.*?)</td>')
        SET @StringSemHTML = LTRIM(RTRIM(dbo.fncRemove_Html_String(@Resultado, 1)))
        SET @StringSemHTML = NULLIF(LTRIM(RTRIM(dbo.fncRecupera_Letras_Sem_Caractere_Especial(@StringSemHTML, 1, '', 1) COLLATE SQL_Latin1_General_CP1_CI_AI)), '')

        IF (LOWER(@Resultado) LIKE '%ncia%correio%' OR LOWER(@Resultado) LIKE '%ncia%filatelica%' OR LOWER(@Resultado) LIKE '%agf %')
            SET @contaLinhas = 1 
        ELSE 
            IF (@StringSemHTML IS NOT NULL AND @StringSemHTML <> 'LOCALIZAÇÃO')
                SET @contaLinhas += 1


        INSERT INTO @Retorno(texto, ranking)
        SELECT @StringSemHTML, @contaLinhas
        WHERE @StringSemHTML IS NOT NULL
        AND @StringSemHTML NOT IN ('Sábado', 'Domingo', 'Horário de almoço', 'Plantão no Sábado', 'Plantão no Domingo', 'HORÁRIO DE ATENDIMENTO', 'LOCALIZAÇÃO')
        AND @StringSemHTML NOT LIKE 'Segunda a sexta%'
        AND @StringSemHTML NOT LIKE 'Sábado%'


        SET @Contador = CHARINDEX(@Resultado, @Ds_Retorno_OUTPUT, @Contador) + LEN(@Resultado)


    END


    DECLARE @id_grupo INT = 0

    SET @Contador = 1
    SET @Total = (SELECT COUNT(*) FROM @Retorno)

    WHILE(@Contador <= @Total)
    BEGIN
        

        IF ((SELECT TOP(1) ranking FROM @Retorno WHERE id = @Contador ORDER BY id) = 1)
            SET @id_grupo += 1


        UPDATE @Retorno
        SET id_grupo = @id_grupo
        WHERE id = @Contador


        SET @Contador += 1

    END

    
    DECLARE @Tabela_Final TABLE (
        Ds_Agencia VARCHAR(50) NOT NULL,
        Ds_Tipo_Agencia VARCHAR(50) NOT NULL,
        Ds_Endereco VARCHAR(100) NOT NULL,
        Ds_Complemento VARCHAR(100) NULL,
        Ds_Bairro VARCHAR(100) NOT NULL,
        Ds_Municipio VARCHAR(100) NOT NULL,
        Ds_Cidade VARCHAR(100) NOT NULL,
        Sg_UF VARCHAR(2) NOT NULL,
        Ds_CEP VARCHAR(10) NOT NULL
    )


    SET @Contador = 1
    SET @Total = (SELECT COUNT(DISTINCT id_grupo) FROM @Retorno)

    WHILE(@Contador <= @Total)
    BEGIN
        

        INSERT INTO @Tabela_Final
        SELECT 
            MAX(CASE WHEN ranking = 1 THEN LTRIM(RTRIM(LEFT(texto, 32))) ELSE NULL END) AS Ds_Agencia,
            MAX(CASE WHEN ranking = 1 THEN LTRIM(RTRIM(SUBSTRING(texto, 32, 9999))) ELSE NULL END) AS Ds_Tipo_Agencia,
            MAX(CASE WHEN ranking = 2 THEN LTRIM(RTRIM(REPLACE(texto, 'Endereço ', ''))) ELSE NULL END) AS Ds_Endereco,
            MAX(CASE WHEN ranking = 3 THEN NULLIF(LTRIM(RTRIM(REPLACE(texto, 'Complemento', ''))), '') ELSE NULL END) AS Ds_Complemento,
            MAX(CASE WHEN ranking = 4 THEN LTRIM(RTRIM(REPLACE(texto, 'Bairro ', ''))) ELSE NULL END) AS Ds_Bairro,
            MAX(CASE WHEN ranking = 5 THEN LTRIM(RTRIM(REPLACE(texto, 'Município ', ''))) ELSE NULL END) AS Ds_Municipio,
            MAX(CASE WHEN ranking = 6 THEN LTRIM(RTRIM(REPLACE(texto, 'Cidade ', ''))) ELSE NULL END) AS Ds_Cidade,
            MAX(CASE WHEN ranking = 7 THEN LTRIM(RTRIM(REPLACE(texto, 'UF ', ''))) ELSE NULL END) AS Sg_UF,
            MAX(CASE WHEN ranking = 8 THEN LTRIM(RTRIM(REPLACE(texto, 'CEP ', ''))) ELSE NULL END) AS Ds_CEP
        FROM 
            @Retorno
        WHERE
            id_grupo = @Contador


        SET @Contador += 1


    END


    SELECT * FROM @Tabela_Final
   

END


/*

EXEC dbo.stpBusca_Agencias_Correios
    @Ds_Cidade = 'Vitória', -- varchar(50)
    @Sg_UF = 'ES'      -- char(2)

*/

Source code for SQLCLR objects

Above, I provided the source for creating the ready-made assembly with the SQLCLR functions and also the database objects (functions and Stored Procedure) to use the assembly. This makes using these objects much easier and does not require knowledge of C# or having Visual Studio on your machine to compile the SQLCLR DLL.

If you are someone who likes to know everything in depth and understand the SQLCLR code, I will also make the original code of these objects available here, for you to include in your SQLCLR project, for example.

If you don't know SQLCLR, you can't miss reading the articles below:
Introduction to SQL CLR (Common Language Runtime) in SQL Server
SQL Server – Performance comparison between Scalar Function and CLR Scalar Function
Other posts about SQLCLR

Source code for Procedure stpWs_Requisicao (C#):
View source code

using System.Data.SqlTypes;
using System.Net;
using System.IO;
using System.Text;

public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void stpWs_Requisicao(SqlString Ds_Url, SqlString Ds_Metodo, SqlString Ds_Parametros, SqlString Ds_Codificacao, out SqlString Ds_Retorno_OUTPUT)
    {

        var parametros = (Ds_Parametros.IsNull) ? "" : Ds_Parametros.Value;
        var metodo = (Ds_Metodo.IsNull) ? "GET" : Ds_Metodo.Value;
        var url = Ds_Url.Value;
        var feedData = string.Empty;
        var encoding = (Ds_Codificacao.IsNull) ? "UTF-8" : Ds_Codificacao.Value;

        if (string.IsNullOrEmpty(encoding))
            encoding = "UTF-8";

        var request = (HttpWebRequest) WebRequest.Create(url);
        request.Method = metodo;
        
        if (metodo == "POST" && parametros.Length > 0)
        {
            var data = parametros;
            var dataStream = Encoding.UTF8.GetBytes(data);

            request.ContentLength = dataStream.Length;

            using (var newStream = request.GetRequestStream())
            {
                newStream.Write(dataStream, 0, dataStream.Length);
                newStream.Close();
            }
        }


        using (var response = (HttpWebResponse) request.GetResponse())
        {
            using (var stream = response.GetResponseStream())
            {
                if (stream != null)
                {
                    using (var streamReader = new StreamReader(stream, Encoding.GetEncoding(encoding)))
                    {
                        feedData = streamReader.ReadToEnd();
                    }
                }
            }
        }

        Ds_Retorno_OUTPUT = feedData;

    }

};

fncCodifica_URL source code (C#):
View source code

using System;
using System.Data.SqlTypes;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlString fncCodifica_URL(SqlString Ds_URL)
    {
        return Ds_URL.IsNull ? SqlString.Null : Uri.EscapeUriString(Ds_URL.Value.Replace("\\\\", "http://").Replace("\\", "/"));
    }
}

fncRemove_Acentuacao source code (C#):
View source code

using System.Text;
using System.Data.SqlTypes;
using System.Globalization;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlString fncRemove_Acentuacao(SqlString Ds_Texto)
    {

        if (Ds_Texto.IsNull)
            return null;

        var s = Ds_Texto.Value.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder();

        foreach (var t in s)
        {
            var uc = CharUnicodeInfo.GetUnicodeCategory(t);
            if (uc != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(t);
            }
        }

        return sb.ToString();

    }
};

fncRegex_Match source code (C#):
View source code

using System;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlString fncRegex_Match(SqlString Ds_Texto, SqlString Ds_Mascara)
    {

        if (Ds_Texto.IsNull || Ds_Mascara.IsNull)
            return SqlString.Null;

        try
        {
            var resultado = Regex.Match(Ds_Texto.Value, Ds_Mascara.Value);
            return resultado.Success ? resultado.Groups[0].Value : SqlString.Null;
        }
        catch (Exception e)
        {
            return SqlString.Null;
        }

    }
}

fncRemove_Html_String source code (C#):
View source code

using System;
using System.Net;
using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static string fncRemove_Html_String(string Ds_String_HTML, bool Fl_Quebra_Linha)
    {

        if (string.IsNullOrEmpty(Ds_String_HTML))
            return null;


        var html = Ds_String_HTML;
        html = WebUtility.HtmlDecode(html);

        if (Fl_Quebra_Linha)
        {
            html = Regex.Replace(html, "<br/>", Environment.NewLine, RegexOptions.IgnoreCase);
            html = Regex.Replace(html, "<br />", Environment.NewLine, RegexOptions.IgnoreCase);
            html = Regex.Replace(html, "<br>", Environment.NewLine, RegexOptions.IgnoreCase);
        }

        html = Regex.Replace(html, " ", " ", RegexOptions.IgnoreCase);
        html = Regex.Replace(html, "<.*?>", string.Empty);

        return html;
    }
}

fncRecupera_Letras_Sem_Caractere_Especial source code (C#):
View source code

using System.Data.SqlTypes;
using System.Globalization;
using System.Linq;
using System.Text;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlString fncRecupera_Letras_Sem_Caractere_Especial(SqlString Ds_Texto, SqlBoolean Fl_Alfanumerico, SqlString Ds_Substituir_Por, SqlBoolean Fl_Aceita_Acentos)
    {

        if (Ds_Texto.IsNull)
            return SqlString.Null;

        if (Fl_Alfanumerico.IsNull)
            return SqlString.Null;

        var aceitaAcentos = (!Fl_Aceita_Acentos.IsNull) && Fl_Aceita_Acentos.Value;
        var substituirPor = (Ds_Substituir_Por.IsNull) ? "" : Ds_Substituir_Por.Value;
        var somenteAlfanumerico = Fl_Alfanumerico.Value;
        var normalizedArray = aceitaAcentos ? Ds_Texto.Value.ToCharArray() : Ds_Texto.Value.Normalize(NormalizationForm.FormD).ToCharArray();
        var builder = new StringBuilder();

        foreach (var ch in normalizedArray.Where(ch => CharUnicodeInfo.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark))
        {

            if (somenteAlfanumerico)
            {

                if (aceitaAcentos)
                {

                    if ((ch >= 65 && ch <= 90) || (ch >= 97 && ch <= 122) || (ch >= 48 && ch <= 57) || ch == 32 || (ch >= 192 && ch <= 207) || (ch >= 209 && ch <= 214) || (ch >= 217 && ch <= 221) || (ch >= 224 && ch <= 228) || (ch >= 231 && ch <= 239) || (ch >= 241 && ch <= 246) || (ch >= 249 && ch <= 253))
                        builder.Append(ch);
                    else if (substituirPor.Length > 0)
                        builder.Append(substituirPor);

                }
                else
                {

                    if ((ch >= 65 && ch <= 90) || (ch >= 97 && ch <= 122) || (ch >= 48 && ch <= 57) || ch == 32)
                        builder.Append(ch);
                    else if (substituirPor.Length > 0)
                        builder.Append(substituirPor);

                }
            }
            else
            {

                if (aceitaAcentos)
                {

                    if ((ch >= 32 && ch <= 126) || (ch >= 192 && ch <= 207) || (ch >= 209 && ch <= 214) || (ch >= 217 && ch <= 221) || (ch >= 224 && ch <= 228) || (ch >= 231 && ch <= 239) || (ch >= 241 && ch <= 246) || (ch >= 249 && ch <= 253))
                        builder.Append(ch);
                    else if (substituirPor.Length > 0)
                        builder.Append(substituirPor);

                }
                else
                {

                    if (ch >= 32 && ch <= 126)
                        builder.Append(ch);
                    else if (substituirPor.Length > 0)
                        builder.Append(substituirPor);

                }
            }
        }
        
        return builder.ToString();

    }
}

That's it, folks!
I hope you enjoyed this post and see you next time!

Hug!