I have a string:
line = "https://dbwebb.se/kunskap/uml#sequence, ftp://bth.com:32/files/im.jpeg, file://localhost:8585/zipit, http://v2-dbwebb.se/do%hack"
I want to get a result like this :
[('https', 'dbwebb.se', ''), ('ftp', 'bth.com', '32'), ('file', 'localhost', '8585'), ('http', 'v2-dbwebb.se', '')]
I tried like :
match = re.findall("([fh]t*ps?|file):[\\/]*(.*?)(:\d+|(?=[\\\/]))", line)
And than i got :
[["https", "dbwebb.se", ""], ["ftp", "bth.com", ":32"], ["file", "localhost", ":8585"], ["http", "v2-dbwebb.se", ""]]
There is one diffrence, you can se ":32" and ":8585". How can i do to get just "32" and "8585" and not the stupid ":" Thanx!