hqgourl

hqgourl is a Go (Golang) package designed for extracting, parsing, and manipulating URLs with ease. This library is useful for developers who need to work with URLs in a structured way.

Features

  • Flexible URL Extraction: Extract URLs from text using regular expressions.
  • Domain Parsing: Parse domains into subdomains, root domains, and top-level domains (TLDs).
  • Extended URL Parsing: Extend the standard net/url package in Go with additional fields and capabilities.

Installation

To install the package, run the following command in your terminal:

go get -v -u github.com/hueristiq/hqgourl

This command downloads the hqgourl package and adds it to your module's dependencies, making it available for import in your project.

Usage

Below are examples demonstrating how to use the different features of the hqgourl package.

URL Extraction

You can extract URLs from a given text string using the URLExtractor. Here's a simple example:

package main

import (
    "fmt"

    "github.com/hueristiq/hqgourl/pkg/extractor"
)

func main() {
    extr := extractor.NewURLExtractor()
    text := "Check out this website: https://example.com and send an email to info@example.com."
    
    regex := extr.CompileRegex()
    matches := regex.FindAllString(text, -1)
    
    fmt.Println("Found URLs:", matches)
}

Customizing URL Extraction

You can customize how URLs are extracted by specifying URL schemes, hosts, or providing custom regular expression patterns.

  • Extract URLs with Specific Schemes (e.g., HTTP, HTTPS, FTP):

    extr := extractor.NewURLExtractor(
        extractor.URLExtractorWithSchemePattern(`(?:https?|ftp)://`),
    )

    This configuration will extract only URLs starting with http, https, or ftp schemes.

  • Extract URLs with Custom Host Patterns (e.g., example.com):

    extr := extractor.NewURLExtractor(
        extractor.URLExtractorWithHostPattern(`(?:www\.)?example\.com`),
    )

    This setup will extract URLs that have hosts matching www.example.com or example.com.
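
    To make the effect of these two options concrete, here is a minimal standard-library sketch of how a scheme pattern and a host pattern can be joined into one regular expression. buildURLRegex and the trailing path pattern are hypothetical names and choices made for this illustration; they are not part of hqgourl and do not describe how the library assembles its own pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// buildURLRegex is a hypothetical helper (not part of hqgourl). It joins a
// scheme pattern and a host pattern, then allows an optional path, query,
// or fragment that runs until whitespace or a quote/angle bracket.
func buildURLRegex(schemePattern, hostPattern string) *regexp.Regexp {
	// (?i) makes the whole match case-insensitive.
	return regexp.MustCompile(`(?i)` + schemePattern + hostPattern + `(?:[/?#][^\s"'<>]*)?`)
}

func main() {
	// The same patterns as in the two options above.
	re := buildURLRegex(`(?:https?|ftp)://`, `(?:www\.)?example\.com`)

	text := "See https://example.com/docs and ftp://www.example.com/file.txt but not https://other.org/page"

	// Only the two example.com URLs satisfy both the scheme and host patterns.
	for _, match := range re.FindAllString(text, -1) {
		fmt.Println(match)
	}
}
```

    Both sub-patterns are wrapped in non-capturing groups, so concatenating them does not change what each alternation matches.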

Note

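Because the API is centered around regexp.Regexp, the compiled expression also gives you the other methods of that type, such as MatchString, FindAllStringIndex, and ReplaceAllString.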
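
As a rough illustration of what those other methods offer, the sketch below uses only the standard regexp package. urlRegex is a deliberately simple stand-in pattern and redactURLs is a hypothetical helper; both are assumptions for this example, and in real use you would call the same methods on the value returned by CompileRegex.

```go
package main

import (
	"fmt"
	"regexp"
)

// urlRegex is a simplified stand-in pattern (an assumption for this sketch),
// not the expression that hqgourl compiles.
var urlRegex = regexp.MustCompile(`https?://[^\s]+`)

// redactURLs is a hypothetical helper that replaces every match with a placeholder.
func redactURLs(text string) string {
	return urlRegex.ReplaceAllString(text, "[link]")
}

func main() {
	text := "Docs: https://example.com/docs mirror: http://mirror.example.com"

	// MatchString reports whether the text contains at least one match.
	fmt.Println("Contains URL:", urlRegex.MatchString(text))

	// FindAllStringIndex returns the byte offsets of each match.
	for _, loc := range urlRegex.FindAllStringIndex(text, -1) {
		fmt.Printf("bytes %d-%d: %s\n", loc[0], loc[1], text[loc[0]:loc[1]])
	}

	// ReplaceAllString rewrites every match.
	fmt.Println(redactURLs(text))
}
```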

Domain Parsing

The DomainParser can parse domains into their components, such as subdomains, root domains, and TLDs:

package main

import (
    "fmt"
    "github.com/hueristiq/hqgourl/pkg/parser"
)

func main() {
    dp := parser.NewDomainParser()

    parsedDomain := dp.Parse("subdomain.example.com")

    fmt.Printf("Subdomain: %s, Root Domain: %s, TLD: %s\n", parsedDomain.Sub, parsedDomain.Root, parsedDomain.TopLevel)
}
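
To show the kind of split this involves, here is a naive standard-library sketch. The knownSuffixes list, the domainParts type, and splitDomain are all hypothetical and exist only for this illustration. The sketch assumes the root is the single label immediately before the suffix, and it does not describe how DomainParser works internally; a real parser needs a far larger suffix list.

```go
package main

import (
	"fmt"
	"strings"
)

// knownSuffixes is a tiny, hypothetical suffix list used only for this sketch.
var knownSuffixes = []string{"co.uk", "com", "org", "net"}

// domainParts is a hypothetical type whose field names mirror those used above.
type domainParts struct {
	Sub, Root, TopLevel string
}

// splitDomain picks the longest known suffix, then treats the label just
// before it as the root and everything earlier as the subdomain.
func splitDomain(domain string) domainParts {
	domain = strings.ToLower(strings.TrimSuffix(domain, "."))

	best := ""
	for _, s := range knownSuffixes {
		if (domain == s || strings.HasSuffix(domain, "."+s)) && len(s) > len(best) {
			best = s
		}
	}
	if best == "" {
		// No known suffix: return the input unchanged as the root.
		return domainParts{Root: domain}
	}

	rest := strings.TrimSuffix(strings.TrimSuffix(domain, best), ".")
	if rest == "" {
		return domainParts{TopLevel: best}
	}

	i := strings.LastIndex(rest, ".")
	if i < 0 {
		return domainParts{Root: rest, TopLevel: best}
	}
	return domainParts{Sub: rest[:i], Root: rest[i+1:], TopLevel: best}
}

func main() {
	for _, d := range []string{"subdomain.example.com", "a.b.example.co.uk", "example.org"} {
		p := splitDomain(d)
		fmt.Printf("%s -> Sub: %q, Root: %q, TLD: %q\n", d, p.Sub, p.Root, p.TopLevel)
	}
}
```

Choosing the longest matching suffix is what keeps multi-label suffixes such as co.uk from being mistaken for a root domain.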

URL Parsing

The URLParser provides an extended way to parse URLs, including additional fields like port and file extension:

package main

import (
    "fmt"
    "github.com/hueristiq/hqgourl/pkg/parser"
)

func main() {
    up := parser.NewURLParser()

    parsedURL, err := up.Parse("https://subdomain.example.com:8080/path/file.txt")
    if err != nil {
        fmt.Println("Error parsing URL:", err)

        return
    }

    fmt.Printf("Subdomain: %s\n", parsedURL.Domain.Sub)
    fmt.Printf("Root Domain: %s\n", parsedURL.Domain.Root)
    fmt.Printf("TLD: %s\n", parsedURL.Domain.TopLevel)
    fmt.Printf("Port: %d\n", parsedURL.Port)
    fmt.Printf("File Extension: %s\n", parsedURL.Extension)
}
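
For comparison, the sketch below derives a numeric port and a file extension with the standard library alone. portAndExtension is a hypothetical helper written for this example. It is not the URLParser implementation, and details such as returning 0 for a missing port or keeping the leading dot on the extension are choices of the sketch, not documented behavior of hqgourl.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strconv"
)

// portAndExtension is a hypothetical helper (not part of hqgourl). It returns
// the explicit port as an int (0 when the URL has none) and the extension of
// the last path element, including its leading dot.
func portAndExtension(rawURL string) (int, string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return 0, "", err
	}

	port := 0
	if p := u.Port(); p != "" {
		port, err = strconv.Atoi(p)
		if err != nil {
			return 0, "", err
		}
	}

	// path.Ext looks only at the final element of the path.
	return port, path.Ext(u.Path), nil
}

func main() {
	port, ext, err := portAndExtension("https://subdomain.example.com:8080/path/file.txt")
	if err != nil {
		fmt.Println("Error parsing URL:", err)

		return
	}

	fmt.Printf("Port: %d\n", port)
	fmt.Printf("File Extension: %s\n", ext)
}
```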

To set a default scheme for the parser:

up := parser.NewURLParser(parser.URLParserWithDefaultScheme("https"))
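
A default scheme matters because net/url treats a scheme-less input such as example.com/path as a relative path and leaves the host empty. The sketch below shows that effect with the standard library. withDefaultScheme is a hypothetical helper for this illustration and is not how URLParserWithDefaultScheme is implemented; its "://" check is deliberately naive and would be fooled by a scheme-less URL that carries another URL in its query string.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// withDefaultScheme is a hypothetical helper (not part of hqgourl). It
// prepends scheme when rawURL does not already contain "://".
func withDefaultScheme(rawURL, scheme string) string {
	if strings.Contains(rawURL, "://") {
		return rawURL
	}

	// Also handles protocol-relative input such as "//example.com".
	return scheme + "://" + strings.TrimPrefix(rawURL, "//")
}

func main() {
	raw := "example.com/path"

	// Without a scheme, the whole input ends up in Path and Host is empty.
	u, err := url.Parse(raw)
	if err != nil {
		fmt.Println("Error parsing URL:", err)

		return
	}
	fmt.Printf("no scheme:   Host=%q Path=%q\n", u.Host, u.Path)

	// With the default scheme applied first, the host is recognized.
	u, err = url.Parse(withDefaultScheme(raw, "https"))
	if err != nil {
		fmt.Println("Error parsing URL:", err)

		return
	}
	fmt.Printf("with scheme: Host=%q Path=%q\n", u.Host, u.Path)
}
```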

Contributing

We welcome contributions! Feel free to submit Pull Requests or report Issues. For more details, check out the contribution guidelines.

Licensing

This package is licensed under the MIT license. You are free to use, modify, and distribute it, as long as you follow the terms of the license. You can find the full MIT license text in the repository.

Credits

Contributors

A huge thanks to all the contributors who have helped make hqgourl what it is today!

Similar Projects

If you're interested in more packages like this, check out:

  • DomainParser
  • urlx
  • xurls
  • goware's tldomains
  • jakewarren's tldomains
